DOCUMENT RESUME

ED 223 475                                              SE 040 063

AUTHOR          Anderson, Ronald D.; And Others
TITLE           Science Meta-Analysis Project: Volume I. Final
                Report.
INSTITUTION     Colorado Univ., Boulder.
SPONS AGENCY    National Science Foundation, Washington, D.C.
PUB DATE        Dec 82
GRANT           NSF-SED-80-12310
NOTE            406p.; For related document, see SE 040 064. Contains
                occasional light and broken type. Produced by the
                Laboratory for Research in Science and Mathematics
                Education.
AVAILABLE FROM  Laboratory for Research in Science and Mathematics
                Education, c/o Dr. Ronald D. Anderson, Campus Box
                249, University of Colorado, Boulder, CO 80309.
PUB TYPE        Reports - Research/Technical (143)
EDRS PRICE      MF01/PC17 Plus Postage.
DESCRIPTORS     *Academic Achievement; Advance Organizers; Computer
                Assisted Instruction; Elementary School Science;
                Elementary Secondary Education; Individualized
                Instruction; Inquiry; Mastery Learning; Programed
                Instruction; *Science Course Improvement Projects;
                *Science Curriculum; Science Education; *Science
                Instruction; Secondary School Science; *Teaching
                Methods
IDENTIFIERS     *Meta Analysis; National Science Foundation; *Science
                Education Research



ABSTRACT 

The National Science Foundation funded a project to: 
(1) identify major areas of science education research in which 
sufficient studies have been conducted to permit useful 
generalizations for educational practice; (2) conduct meta-analyses 
of each of these areas; and (3) prepare a compendium of these 
meta-analyses along with interpretative and integrative statements. 
This report constitutes volume I of the compendium. Four separate 
studies are reported following introductory comments on the project. 
These studies and authors are: (1) "The Effects of New Science
Curricula on Student Performance" (James A. Shymansky, William C.
Kyle, Jr. and Jennifer M. Alport); (2) "Instructional Systems in
Science Education" (John B. Willett and June J. M. Yamashita);
(3) "The Effects of Various Science Teaching Strategies on
Achievement" (Kevin C. Wise and James R. Okey); and (4) "The Effect
of Inquiry Teaching and Advance Organizers upon Student Outcomes in
Science" (Gerald W. Lott). Each study includes a separate table of
contents, purpose, methodology, results, conclusions, and supporting
documentation. (Author/JN)



***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made      *
*                   from the original document.                       *
***********************************************************************




U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION



SCIENCE META-ANALYSIS PROJECT: FINAL REPORT 
OF NSF PROJECT NO. SED 80-12310 



Ronald D. Anderson
Stuart R. Kahl
Gene V. Glass
Mary Lee Smith
M. Lynnette Fleming
Mark R. Malone



Laboratory for Research in Science and Mathematics Education 

University of Colorado
Boulder, Colorado 80309 

December, 1982 



This material is based upon work supported by the National Science Foundation 
under Grant No. SED 80-12310. Any opinions, findings, and conclusions or 
recommendations expressed in this publication are those of the authors and 
do not necessarily reflect the views of the National Science Foundation. 







TABLE OF CONTENTS 



Introduction to the Science Education Meta-Analysis Project 

Ronald D. Anderson, Stuart R. Kahl, Gene V. Glass and 
Mary Lee Smith 

The Effects of New Science Curricula on Student Performance 

James A. Shymansky, William C. Kyle, Jr. and Jennifer M. Alport 

Instructional Systems in Science Education 
John B. Willett and June J. M. Yamashita 

The Effects of Various Science Teaching Strategies on Achievement 

Kevin C. Wise and James R. Okey 

The Effect of Inquiry Teaching and Advance Organizers Upon Student 
Outcomes in Science 

Gerald W. Lott 

Research Science Teacher Education Practice Associated With Inquiry
Strategy

Gary L. Sweitzer

Science Teacher Characteristics By Teacher Behavior and By Student
Outcome

Cynthia Ann Druva

The Relationship of Student Characteristics and Student Performance
in Science

M. Lynnette Fleming and Mark R. Malone

A Consolidation and Appraisal of Meta-Analyses

Ronald D. Anderson

Appendices

A - Coding Forms

B - Bibliography of Studies Coded



PREFACE 

Science education probably has the longest and richest
tradition of research of all the subject fields in education.
Because university personnel in science education typically have
a background in one of the natural sciences, it is not
surprising that they have been interested in empirical studies.
The interest was strong enough that over a half century ago they
formed their own research organization, long before analogous
research organizations were commonplace in other subject areas.
The interest persists; well over 3000 dissertation studies alone
have been conducted in science education since mid-century.
Science educators have been interested in conducting research
and in examining its implications for classroom practice.

But when translating research results into practice, science
educators have faced the same difficulty encountered by other
educational researchers and scholars in all areas of social
research, namely, how to integrate the many findings
acquired from varying research settings and having conclusions
with less than perfect agreement. The numerous variables and
less than perfectly controlled situations common to all social
research have left science education with the task of finding
meaning in a complex set of research findings.

Quantitative procedures for integrating research findings 
give hope that the extant body of research literature can be 
given increased meaning and clearer implications for practice. 



The recent emergence of meta-analysis led to its application
in several places in science education and finally to a proposal
to the National Science Foundation seeking support for the
project reported herein. Its purpose was to (1) identify the
major areas of science education research in which sufficient
studies have been conducted to permit useful generalizations
for educational practice, (2) conduct a meta-analysis of each
of these areas, and (3) prepare a compendium of these
meta-analyses along with interpretive and integrative statements.
This report constitutes that compendium.

A project of this scope, of course, involves a large 
number of people and acknowledgment of their extensive efforts 
is gratefully given. Although varied in terms of their role 
and involvement, each made important contributions to the 
overall endeavor. 

The local project staff included Stuart R. Kahl, who
served as associate director during its first year. His
work in coordinating the literature searches and coding
form development, as well as in preparing common data file
formats and serving as a statistical consultant to the
research assistants, was of major importance. Other local
staff included Gene V. Glass and Mary Lee Smith, developers
of the meta-analysis technique, who were a guiding force in
the development of the project, trained the research assistants,
and served as consultants to them in their work. Other key
people in the project were the secretaries, Ellen Ward and
later Lisa Hamilton, whose sterling performances were invaluable.

The staff extended far beyond the University of Colorado
and included personnel from six other universities. The
original participants were an advisory committee consisting
of one person from each institution as follows:

    J. Myron Atkin, Stanford University
    Robert Howe, Ohio State University
    James Okey, University of Georgia
    Lee Shulman, Michigan State University
    James Shymansky, University of Iowa
    Wayne Welch, University of Minnesota

All played an important role in shaping the project at its
inception; two of them, James Okey and James Shymansky,
continued in the project as researchers.

The extensive work of the many researchers involved in
this project is reflected in their authorship of the several
chapters of this report. In addition to persons already
mentioned they include a cadre of people appointed as research
assistants at each of the institutions involved. Among this
staff were William C. Kyle, Jr. and Jennifer M. Alport of the
University of Iowa, John B. Willett and June J. M. Yamashita
of Stanford University, Kevin C. Wise of the University of
Georgia, Gerald W. Lott of Michigan State University, Gary
L. Sweitzer of Ohio State University, Cynthia Ann Druva of the
University of Minnesota, and Mark R. Malone and M. Lynnette
Fleming of the University of Colorado. Their work reflects
not only the many hours required in the labor-intensive
meta-analysis process but also their high level of professional
competence and scholarly ability.

RDA 




INTRODUCTION TO THE SCIENCE EDUCATION 
META-ANALYSIS PROJECT 



Ronald D. Anderson
Stuart R. Kahl
Gene V. Glass
Mary Lee Smith

Laboratory for Research in Science and
Mathematics Education
University of Colorado
Boulder, Colorado 80309



SCIENCE EDUCATION: A META-ANALYSIS OF MAJOR QUESTIONS 

While meta-analysis (Glass, 1976) has been on the educational
research scene for only a few years, it has become
established as an important technique. It is proving useful
in translating the results of numerous studies on a particular
topic into a concise form that both reflects the multiplicity
of data found in the many studies and is understandable to the
educational practitioner who may be in a position to apply
the results. The characteristics of this methodology and
guidelines for employing it are well documented (Glass,
McGaw, and Smith, 1981). While this approach already has been
utilized for several science education questions, it has
additional potential value if applied to the wide sweep of
major science education research questions in a systematic
manner. Such an approach requires focusing on the major
research questions in the field, giving attention to various
subquestions subsumed under each major question, and examining
common themes that cut across the major questions.

A project of this design was conducted under National
Science Foundation Grant No. SED 80-12310. Within the
conceptual framework described above, a large number of
research studies were integrated, with the results providing
a basis for interpretive and integrative statements about
the major questions addressed in the science education
research literature.

A MULTI-INSTITUTIONAL ENDEAVOR

Although primarily conducted at the University of Colorado,
major portions of the project work were done under a
multi-institutional arrangement involving researchers from six
other institutions. A leading researcher from each of these
institutions served on an advisory committee that aided in
identifying the research questions pursued and assisted in
designing an endeavor encompassing the work of one or more
researchers from his or her home institution. The actual
coding and analysis work was conducted by researchers located
at these six research centers and the University of
Colorado. At each location an individual or a team of up to
three researchers conducted this work.

Prior to beginning this coding and analysis work, all
of the researchers attended a week-long session for training
and coordination of work. During this time each individual
or team developed the initial version of the coding forms
with a large percentage of the categories and format in common.
This process resulted in a data base which can be examined
across research questions.

This multi-institutional approach had both advantages
and disadvantages. It was possible to involve a large research
group which was not already extant at one institution. It
had the further advantage of stimulating meta-analysis work in
a variety of locations where in many cases it was not already
underway. One of the disadvantages was the inability to
readily shift manpower among questions as their scope became
more clearly identified during the actual coding process. As
a result there is variation in the thoroughness with which
the literature has been sampled for each of the research
questions. Though this variation is identified here as a
disadvantage, it is not a serious problem, as indicated in a
later section of this report.

IDENTIFYING THE RESEARCH QUESTIONS

The first step in the project was to identify the major
science education questions to pursue. It was accomplished
by a combination of (a) empirical analysis of the extant
research, and (b) expert judgment as to the importance of
particular questions. Major attention was given to the
empirical analysis rather than the expert judgment, however,
in that the basic approach was to include whatever empirical
analysis showed to be the subject of a substantial number of
research investigations.

The first step was initiated by collecting and examining
a representative sample of science education research studies.
Literature was sampled across time and type of publication and
included studies from The Journal of Research in Science Teaching,
Science Education, Dissertation Abstracts, and the most recent
abstracts of presentations for the National Association for
Research in Science Teaching annual convention. About 300 such
research articles were sampled, and the major (as well as
subsidiary) questions addressed were recorded.


The questions collected were then classified into some broad,
general categories. Five persons classified separate portions of
the questions into categories. These categories, developed
independently by each of the five persons, had much in common. The
entire group of five then examined the questions and organized them
into a simple classification system. It resulted in thirteen general
areas encompassing all but a small percentage of studies, which
neither fit within these thirteen categories nor constituted a
meaningful grouping themselves.

The researchers then went back to the literature (including
the Curtis digests of Research in Science Education of several
decades ago) to see if additional research questions fit within
the framework that had been empirically derived. This
cross-validation indicated the categories were appropriate.

The next step was to develop a full description of each of
these thirteen areas. They were identified by a generic question
for each area along with sample subquestions. These sample
subquestions were examples of a larger set of such subquestions; they
were a representative and not exhaustive set. In addition,
definitions of terms, descriptions of some variables, and a limited
rationale for considering the questions were provided.

A form was then developed on which responses could be obtained
from other science education researchers concerning these categories.
Twenty people were mailed a full description of the thirteen areas,
a response form, and a cover letter requesting that they be prepared
to discuss the material by phone. All twenty people responded to
a telephone request for their judgments on the relative importance
of these questions and the adequacy of the literature for doing a
meta-analysis. While these judgments of relative importance
were of value, they were largely subordinated to an
empirical search of the total science education research.

Literature searches were conducted on a sampling basis to
obtain an estimate of the size of the literature and determine
if sufficient studies existed for a meta-analysis of each question.
Abbreviated computer searches were conducted using data bases such
as ERIC, Dissertation Abstracts, and Social Science Research. The
citations obtained then were screened to eliminate those items
which were not research publications. Subsequent investigation
indicated some problems with the manner in which the computer
searches had been conducted, so additional searches were done
"by hand" as a check. They were done on a sampling basis using
selected annual reviews of science education research and Science
Education - A Dissertation Bibliography, a listing of all doctoral
dissertations pertaining to science education conducted between
1950 and 1977. These procedures provided a rough estimate of the
size of the literature pertaining to each of the thirteen questions.

At this point a two-day conference of the advisory committee
was convened to confer with the project staff and produce a final
classification of research questions for meta-analysis as well as
identify important variables to include when integrating the research
for each question.

One of the original questions ("What are the goals and priorities
of science education?") was eliminated due to an insufficient number
of empirical studies, even though it was ranked high in importance.
The other twelve questions were recombined into a broader set of
questions as follows:

  I. What are the effects of different curricular programs
     in science?

 II. What are the effects of different instructional systems
     used in science teaching (e.g. programmed instruction,
     mastery learning, departmentalized instruction)?

III. What are the effects of different teaching techniques
     (e.g. questioning behaviors, wait-time, advance
     organizers, testing practices)?

 IV. What are the effects of different pre-service and
     in-service teacher education programs and techniques?

  V. What are the relationships between science teacher
     characteristics and teacher behaviors or student
     outcomes?

 VI. What are the relationships between student characteristics
     and student outcomes in science?

While these six questions as stated were pursued initially,
some of them were delimited further when subsequent search activities
made it clear that they were too broad to complete within the resources
of the project.

THE LITERATURE SEARCH PROCESS

Identifying and collecting the research studies to be part
of a meta-analysis is a major step in the total endeavor. This
aspect of the project will be described in terms of the
(a) limitations placed on the studies to be included, (b) search
strategies employed, and (c) variations in the literature covered
among the major questions within the total project.



Restrictions on Scope of the Questions

Because of the need to keep the meta-analysis to a manageable
size and to maintain some degree of commonality among the studies
included under a particular question, the following restrictions
were placed on the studies to be included.

1. The studies were limited to those conducted in the 
context of grades K through 12. 

2. The studies included were limited to those conducted 
within the United States. 

3. For questions I-IV, only those with a control group 
were included. 

4. The studies were limited to those published in 1950 
or later. 
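
The four restrictions above amount to a simple inclusion test applied
to each candidate study. The following is a minimal sketch in Python,
not the project's actual procedure; the record layout and every field
name (grade_low, has_control_group, and so on) are hypothetical and
chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # All field names are hypothetical; the report does not specify a layout.
    grade_low: int           # lowest grade studied (0 stands for kindergarten)
    grade_high: int          # highest grade studied
    country: str             # country in which the study was conducted
    has_control_group: bool  # True if a control group was used
    year: int                # year the study was published
    question: str            # project question, "I" through "VI"


# Restriction 3 applies only to questions I-IV.
CONTROL_GROUP_QUESTIONS = {"I", "II", "III", "IV"}


def meets_restrictions(s: Study) -> bool:
    """Return True if the study passes all four restrictions listed above."""
    in_k12 = 0 <= s.grade_low <= s.grade_high <= 12
    in_us = s.country == "US"
    control_ok = s.has_control_group or s.question not in CONTROL_GROUP_QUESTIONS
    recent_enough = s.year >= 1950
    return in_k12 and in_us and control_ok and recent_enough
```

For example, a 1968 junior high study without a control group would
fail this test under question III but pass it under question V.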

The Search Process

In a departure from many past meta-analyses, it was decided
that the search process would begin with dissertations because
of the thoroughness with which data are typically reported therein,
and because such a large percentage of research studies are
conducted within that context. This process of searching dissertations
was greatly facilitated by the existence of the previously mentioned
bibliography which lists all doctoral dissertations pertaining to
science education conducted between 1950 and 1977. This document
lists approximately 3,200 science doctoral dissertations; the entire
document was systematically examined to identify each
dissertation which, by title and categorization within the
bibliography, appeared to be a potential candidate for the
meta-analysis. These approximately 1,000 dissertations were obtained
on microfilm from the Science and Mathematics ERIC center at Ohio
State University. Each dissertation was read to determine if it
actually pertained to the topic at hand and, if so, it was utilized
in the meta-analysis.

Another facet of the search process was screening the biblio- 
graphies in each coded publication to identify additional studies 
to be included in the meta-analysis. In addition to identifying 
journal articles through this standard bibliographic search method, 
ERIC searches and simple screening of the entire collection of 
issues for the relevant years of selected journals were conducted. 
Among the various research sites, the procedures for identifying 
journal reports to be included varied considerably. Whatever 
mechanism was used, a high percentage of the articles located 
were reports of studies already coded from dissertations. Finally, 
some studies utilized in this meta-analysis were reported in other 
sources such as books or unpublished reports. 

Variations in Literature Covered 

While there was considerable variation in the amount of litera- 
ture covered among the several research sites, there was consistency 
in removing many studies from consideration without coding them 
once they had been read and their exact character ascertained. 
While 769 studies were coded, nearly 2,000 studies were read in 
the process. Among the reasons for excluding studies were the 
following. 

a. The most common reason for eliminating a study was
   inadequate reporting, i.e., not enough information
   was provided to make it possible to calculate an
   effect size.

b. The study did not utilize a control group.

c. The study was not within the K-12 limit; most studies
   eliminated were college level.

d. The study was conducted outside the United States.

Even given this limiting of the studies included, many of
the researchers were faced with a body of literature larger than
was possible for them to code and analyze completely within their
time limitations. The means of limiting the number of studies
varied from one site to another but generally were one of the
following three approaches. (a) Some sites found it possible
to code and analyze essentially the entire body of literature
located through the search procedure described above and contained
within the boundaries cited earlier. (b) Some sites chose to
limit the scope of their original question to one or more key
subquestions. (c) Some maintained the scope of their coverage
but selected only a portion of the studies for analysis.

CODING THE STUDIES

Meta-analysis endeavors are very labor-intensive; the most
time-consuming part is reading each study and recording on the
coding sheets each relevant piece of information. Of the dozens
of items of information potentially available for a given study,
the major one is an effect size that provides a quantitative
comparison of the experimental and control groups
(or, in the case of a correlational study, the correlation between
two variables). For an experimental study, an effect size is
calculated which provides a normalized measure of the difference
in performance of the two groups with respect to a specified
dependent variable such as achievement, attitude toward science,
or any other outcome variable. Symbolized by the Greek letter
Δ (delta) and abbreviated E.S., effect size is defined as the
difference between the means of the experimental group and the
control group on the given variable, divided by the standard
deviation of the control group.

$$\Delta = \frac{\bar{X}_E - \bar{X}_C}{S_C}$$

where $\bar{X}_E$ = mean of experimental group,
      $\bar{X}_C$ = mean of control group, and
      $S_C$ = standard deviation of the control group.

The calculations involved in determining the effect size 
vary considerably depending upon the particular form of the data 
reported in a given study. The numerous procedures required in 
the various situations are well developed (Glass, McGaw, and 
Smith, 1981).
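
For the simplest case, in which a study reports group means and the
control group standard deviation (or the raw scores themselves), the
definition above can be sketched as follows. This is an illustrative
Python sketch only; the function names are hypothetical, and it does
not cover the other conversion procedures referred to above.

```python
from statistics import mean, stdev


def effect_size(mean_exp: float, mean_ctrl: float, sd_ctrl: float) -> float:
    """Effect size as defined above: mean difference over control-group SD."""
    if sd_ctrl <= 0:
        raise ValueError("control-group standard deviation must be positive")
    return (mean_exp - mean_ctrl) / sd_ctrl


def effect_size_from_scores(exp_scores, ctrl_scores) -> float:
    """Same quantity computed from raw scores.

    Assumption: the control-group SD is the sample standard deviation
    (n - 1 in the denominator), which is what statistics.stdev returns.
    """
    return effect_size(mean(exp_scores), mean(ctrl_scores), stdev(ctrl_scores))
```

A positive value indicates that the experimental group outperformed
the control group; a value of 0.5 means the experimental mean lies
half a control-group standard deviation above the control mean.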



INTEGRATING THE RESULTS 

Once the coding (recording information on all demographic,
independent and dependent variables available in the report) for
all of the studies in the meta-analysis has been completed,
attention is turned to integrating this information. This step
involved calculating an average effect size (the mean Δ, a simple
arithmetic average) from all those obtained on a given outcome variable
such as achievement (and/or some particular category of achievement),
attitude toward science, laboratory skills or whatever outcome
variable has been examined within some subset of the studies
involved. Furthermore, an average effect size can be calculated
for a particular outcome variable from all studies with a
particular independent variable, and this average effect size then can
be compared to the average effect size on the same outcome variable
for those studies having a different independent variable. For
example, in the meta-analysis of studies of instructional systems
in science (at K-12 levels), the average effect size on cognitive
achievement for 5 studies of audio-tutorial systems was .09; that is,
the experimental groups averaged .09 standard deviations higher than
their control groups. The average effect size on cognitive
achievement for 7 studies of "Keller Plan" systems was .49 standard
deviations in favor of the experimental groups. This same type of
comparison can also be made for other outcome variables. For example,
one of the audio-tutorial studies had an affective measure; its
effect size was .33 in favor of the experimental group. Two "Keller
Plan" studies had an affective measure, with an average effect size
of .52. Similar statements can be made about these two instructional
systems with respect to any other outcome measures included in some
of the studies, and similar comparisons can be made with other
instructional systems with respect to any outcome measures included
in studies of these systems.
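
The averaging step described above can be sketched as a grouping of
coded effect sizes by independent variable and outcome variable,
followed by a simple arithmetic mean within each group. The Python
below is a minimal illustration; the record layout, the function name,
and the sample values in the usage note are hypothetical and are not
data from this project.

```python
from collections import defaultdict
from statistics import mean


def average_effect_sizes(records):
    """Average effect sizes within each (treatment, outcome) group.

    records: iterable of (treatment, outcome, effect_size) tuples.
    Returns a dict mapping (treatment, outcome) to a pair
    (average effect size, number of effect sizes averaged).
    """
    groups = defaultdict(list)
    for treatment, outcome, es in records:
        groups[(treatment, outcome)].append(es)
    return {key: (mean(values), len(values)) for key, values in groups.items()}
```

With made-up records such as ("system A", "achievement", 0.2),
("system A", "achievement", 0.4) and ("system B", "achievement", 0.5),
the function reports an average near 0.3 over 2 effect sizes for
system A and 0.5 over 1 effect size for system B, which is the form of
comparison drawn in the paragraph above. Reporting the count alongside
each average makes it plain when an average rests on very few studies.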

A variety of issues have been raised about the interpretation 
of such results as described above. For a discussion of the issues 
the reader is referred to a recent article (Glass, 1982) or book 
on the topic (Glass, McGaw, and Smith, 1981). 






PROJECT RESULTS 

The results of the meta-analysis in this project are reported 
in the following chapters of this report. They include one chapter 
associated with each of the previously identified questions (two 
chapters in the case of question III) and a chapter dealing with 
research issues for which data is drawn from one or more of the 
separate meta-analyses. Brief descriptions of the data files 
acquired are provided in each of the individual research papers. 
Copies of the coding sheets used and the complete bibliography 
of research studies coded are provided in the appendices of this 
report of the project. The total data base has been compiled
on one master file at the University of Colorado and is available,
along with a User's Manual (Kahl and Anderson, 1982), to other
researchers who wish to use it.

REFERENCES 

Glass, G.V. Primary, Secondary, and Meta-Analysis of Research. 
Educational Researcher , 1976, 5(10), 3-8. 

Glass, G.V., McGaw, B.V. & Smith, M.L. Meta-Analysis in 
Social Research. Beverly Hills, California: Sage Publications, 
1981. 

Kahl, S.R., & Anderson, R.D. Science Meta-Analysis Project:
User's Guide for the Machine-Readable Raw Data File. Boulder, 
Colorado: Laboratory for Research in Science and Mathematics 
Education, University of Colorado, 1982, 138 pages. 

University Microfilms International. Science Education - A 
Dissertation Bibliography. Ann Arbor, Michigan: Dissertation 
Publishing, University Microfilms International, 1978. 




THE EFFECTS OF NEW SCIENCE CURRICULA 
ON STUDENT PERFORMANCE 



by 

James A. Shymansky, William C. Kyle, Jr.
Science Education Center 
The University of Iowa 
Iowa City, Iowa 52242-1478 

and 

Jennifer M. Alport 
Department of Biological Sciences 
University of Natal 
Natal, Republic of South Africa 4001 



THE EFFECTS OF NEW SCIENCE CURRICULA 
ON STUDENT PERFORMANCE 

An Abstract 

Elementary, junior high and secondary school science experienced
a period of curriculum development and growth, beginning in the late
1950's and continuing through the early 1970's, that can be described
only as phenomenal. Several groups of concerned scientists and educators
developed modern science programs with a major emphasis on the nature,
structure, and unity of science while accentuating the investigative,
exploratory phases of science, and the development of scientific
inquiry. In contrast to these new curricula, "traditional" courses
generally tended to concentrate on the knowledge of scientific facts,
laws, theories, and technological applications (Haney, 1966; Klopfer,
1971; Schwab, 1963).

The public became very science and technology conscious following
the historic launching of Sputnik I by the Soviet Union on October 4,
1957. The numerous "alphabet-soup" curricula which were developed
as a result of public outcries and financial support from federal
agencies and private foundations were aimed at rekindling student
interest in science and upgrading the lethargic science curriculum in
the schools. Morris Shamos, a noted physicist, science educator, and
curriculum director, estimated that 5 billion dollars were spent to
improve K-12 science education during the 15 years following Sputnik I
(Yager, 1981a). A substantial amount of this support was from the
National Science Foundation (NSF).

Since the inception of the NSF-sponsored curriculum development
era there have been numerous evaluation efforts to assess the impact of
the new science curricula versus traditional science courses. The
question as to whether the newly developed curricula were any "better"
than the traditional courses became a leading issue in science
education. The large body of research on the effects of the new curricula
is generally viewed as inconclusive. A brief scan through the
literature reveals that some studies claim that the new curricula
facilitate cognitive and/or affective achievement while others claim
that they do not. Thus, after 25 years of sporadic implementation, the
question of how effective new science curricula actually are in
enhancing student performance is still unanswered.

This study utilizes the quantitative approach to research
integration known as meta-analysis (Glass, 1976) to synthesize
the results of 105 experimental studies involving 45,626 students.
Thus, this study is a quantitative synthesis of the retrievable primary
research focusing on the effects of new science curricula on student
performance. A total of 27 new science curricula involving one or
more measures of student performance are included in this meta-analysis.

Data were collected for 18 a priori selected student performance 
measures. These 18 criterion variables were grouped into 6 criterion 
clusters as follows: 



1. General Achievement Cluster
   a. Cognitive - low
   b. Cognitive - high
   c. Cognitive - mixed/general achievement

2. Perceptions Cluster
   a. Affective - attitude toward subject
   b. Affective - attitude toward science
   c. Affective - attitude toward procedure/methodology
   d. Self-concept

3. Process Skills Cluster
   a. Process skills
   b. Methods of science

4. Analytic Skills Cluster
   a. Critical thinking
   b. Problem solving

5. Related Skills Cluster
   a. Reading
   b. Mathematics
   c. Social studies
   d. Communication skills

6. Miscellaneous
   a. Creativity
   b. Logical thinking (Piagetian)
   c. Spatial relations (Piagetian)

In addressing the overall question of new science curriculum
effectiveness, the data are arranged in three broad categories:
curriculum characteristics, student or teacher factors, and study design
features. The variable analyzed in all cases is student performance
measured in terms of the meta-analysis common metric known as effect
size (Glass, 1976). The effect size is a common metric derived from
the various tests of student performance in all the studies analyzed
and provides a basis for comparison across the many studies addressing
the broad question of curriculum effectiveness.

The results of this meta-analysis reveal definite positive
patterns of student performance in new science curricula. Across all
new science curricula analyzed, students exposed to new science
curricula performed better than their traditional counterparts in
achievement, analytic skills, process skills, and related skills, while
developing a more positive attitude toward science. On a composite
basis, the average student in new science curricula exceeded the
performance of 63% of the students in traditional science courses.
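
The percentile statement above can be related to the effect size metric with a short calculation. The sketch below is illustrative only: it assumes that scores in the traditional (control) group are approximately normally distributed, and the function names are hypothetical, not taken from the report. Under that assumption, a composite effect size of roughly one third of a standard deviation corresponds to the average new-curriculum student exceeding about 63% of traditional-course students.

```python
from statistics import NormalDist

# Assumption: control-group scores are approximately normal, so an effect
# size (in control-group standard deviation units) maps to a percentile
# through the standard normal cumulative distribution function.

def percentile_from_effect_size(delta: float) -> float:
    """Percent of the control group exceeded by the average treated student."""
    return 100.0 * NormalDist().cdf(delta)

def effect_size_from_percentile(pct: float) -> float:
    """Effect size implied by a 'exceeds pct% of controls' statement."""
    return NormalDist().inv_cdf(pct / 100.0)

if __name__ == "__main__":
    # The 63% figure quoted in the text implies an effect size near 0.33.
    print(round(effect_size_from_percentile(63.0), 2))
    # An effect size of zero means the average student sits at the median.
    print(percentile_from_effect_size(0.0))
```

An effect size of zero maps to the 50th percentile, which gives a quick sanity check on the direction of the conversion.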

Further breakdowns of the student performance data reveal other
interesting characteristics of new science curricula. For example, new
science curricula in biology (i.e., the BSCS programs) produced the
most positive performance scores among the science disciplines, while
chemistry and earth science curricula appear to have had the least
positive impact. Also, studies involving new science curricula judged
to have a low emphasis on laboratory activity showed students
out-performing their traditional course counterparts by larger margins
overall than those new science curricula judged to have a high
laboratory emphasis. On the other hand, studies involving new science
curricula judged to have a high emphasis on process skill development
showed students out-performing traditional course students by larger
margins on analytic skill measures than those involving curricula
judged to have a low process skill emphasis.

In terms of overall performance, science curricula produced 
equally positive results when broken down by grade level (K-5, 7-9, 
10-12, post secondary). However, student performance in new science 
curricula was significantly enhanced where mixed samples of male and 
female students were studied compared to either predominantly male or 
female samples.






Finally, the quantitative synthesis revealed that student per- 
formance in new science programs was adversely affected when teachers 
received inservice or preservice training in the use of the new curriculum
materials. Alternative explanations for this and other findings
are thoroughly discussed in this study. 




TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
STATEMENT OF THE PROBLEM
BACKGROUND
PROCEDURES
    Description of the Search Methods
    Studies Included
    Coding Variables
    Background and Coding Information
    Sample Characteristics
    Treatment Characteristics
    Teacher Characteristics
    Design Characteristics
    Outcome Characteristics
    Effect Size Calculations
    Reliability
    Procedures Regarding Effect Size Calculation
    Methods and Data Analyses
RESULTS AND DISCUSSION
    Curriculum Characteristics
    Achievement Cluster
    Student Performance
    Process Skills
    Analytic Thinking
    Related Skills
    Other Performance Areas
    Life Sciences
    Physical Science
    General Science
    Earth Science
    Biology
    Chemistry
    Physics
    Inquiry Emphasis
    Process Emphasis
    Laboratory Emphasis
    Individualization Emphasis
    Content Emphasis
    Study Characteristics
SUMMARY
BIBLIOGRAPHY






LIST OF TABLES

TABLE

 1  DISTRIBUTION OF EFFECT SIZES (Δ's) BY SOURCE OF STUDY
 2  DISTRIBUTION OF STUDENT SAMPLE BY SOURCE OF STUDY
 3  CURRICULUM PROFILE
 4  EFFECT SIZE DATA FOR COMPOSITE STUDENT PERFORMANCE
    MEASURES BY CURRICULUM
 5  EFFECT SIZE DATA FOR PERFORMANCE CRITERIA
 6  EFFECT SIZE DATA FOR CRITERION CLUSTERS ACROSS ALL
    CURRICULA
 7  EFFECT SIZE DATA FOR THE ACHIEVEMENT CRITERION CLUSTER
    BY CURRICULUM
 8  EFFECT SIZE DATA FOR THE PERCEPTIONS CRITERION CLUSTER
    BY CURRICULUM
 9  EFFECT SIZE DATA FOR THE PROCESS SKILLS CRITERION
    CLUSTER BY CURRICULUM
10  EFFECT SIZE DATA FOR THE ANALYTIC THINKING CRITERION
    CLUSTER BY CURRICULUM
11  EFFECT SIZE DATA FOR THE RELATED SKILLS CRITERION
    CLUSTER BY CURRICULUM
12  EFFECT SIZE DATA FOR THE OTHER MENTAL FUNCTIONS CRITERION
    CLUSTER BY CURRICULUM
13  EFFECT SIZE DATA FOR PERFORMANCE CRITERION CLUSTERS
    BY CONTENT OF CURRICULA
14  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
    PROFILE: EMPHASIS ON INQUIRY
15  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
    PROFILE: EMPHASIS ON PROCESS SKILLS
16  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
    PROFILE: EMPHASIS ON LABORATORY
17  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
    PROFILE: EMPHASIS ON INDIVIDUALIZATION
18  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
    PROFILE: EMPHASIS ON CONTENT
19  CORRELATIONS BETWEEN CURRICULUM PROFILE RATINGS AND
    EFFECT SIZES CALCULATED FROM STUDENT PERFORMANCE DATA
20  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY GRADE LEVEL
    ACROSS ALL NEW SCIENCE CURRICULA
21  EFFECT SIZE DATA FOR CRITERION VARIABLES BY STUDENT
    GENDER ACROSS ALL NEW SCIENCE CURRICULA
22  ANOVA SUMMARY FOR EFFECT SIZE DATA GROUPED BY CRITERION
    CLUSTER AND SAMPLE GENDER
23  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY SCHOOL TYPE
    ACROSS ALL NEW SCIENCE CURRICULA
24  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY STUDENT
    SOCIO-ECONOMIC STATUS ACROSS ALL NEW SCIENCE CURRICULA
25  ANOVA SUMMARY FOR EFFECT SIZE DATA GROUPED BY CRITERION
    CLUSTER AND SOCIO-ECONOMIC STATUS
26  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY INSERVICE
    EXPERIENCE ACROSS ALL NEW SCIENCE CURRICULA
27  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY PRESERVICE
    EXPERIENCE ACROSS ALL NEW SCIENCE CURRICULA
28  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY RATED LEVEL
    OF INTERNAL VALIDITY
29  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY TYPE OF TEST
    USED
30  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY LENGTH OF
    TREATMENT
31  EFFECT SIZE DATA FOR CRITERION CLUSTERS BY FORM OF
    PUBLICATION






LIST OF FIGURES

FIGURE

1  BAR GRAPH OF THE MEAN EFFECT SIZES FOR CRITERION
   CLUSTERS ACROSS ALL CURRICULA
2  BAR GRAPH OF THE MEAN EFFECT SIZES FOR COMPOSITE
   PERFORMANCE BY GENDER






THE EFFECTS OF NEW SCIENCE CURRICULA 
ON STUDENT PERFORMANCE 

INTRODUCTION 

Since 1955, and particularly during the 1960's and early 1970's,
elementary, junior high, and secondary school science curricula
experienced considerable growth and substantial change which can be
described only as "phenomenal." It is generally accepted that the
launching of Sputnik I on October 4, 1957 stimulated this sudden
growth and concomitant curriculum development. In an attempt to "make
up lost ground" in the technological race with the Soviet Union,
American scientists and educators initiated an all-out effort to
upgrade science curricula and science instruction. The public became
very science and technology conscious during this period as
federal agencies and private foundations provided financial support
for the resulting wave of new science curricula.

New science programs emerged quickly for high school physics,
chemistry, and biology. The development soon encompassed the junior
high and elementary science programs. Within 15 years of the
historical launching of the Russian satellite, dozens of "alphabet-soup"
science curricula were developed including such well-known programs
as PSSC, CBA, BSCS, CHEM Study, ESS, S-APA, SCIS, and ESCP, as well
as other lesser known programs such as COPES, ISLI and IS. One noted
scientist and educator, Morris Shamos, estimated that approximately
5 billion dollars were spent on K-12 science improvement during the
15 year post-Sputnik era (Yager, 1981a).

A complete set of goals and objectives for the new science
curricula was never really articulated by the numerous new curriculum
designers. The prevailing notion, however, was that the traditional
courses, which tended to concentrate on the knowledge of scientific
facts, laws, theories, and technological applications, were somehow
ineffective in developing the creative genius needed to forge ahead
in a rapidly evolving scientific world (Haney, 1966; Klopfer, 1971;
Schwab, 1963). The new science curricula were supposed to rekindle
student interest in science and accelerate the development of a 21st
century science perspective by emphasizing the structure and process
of science. Rather than allowing the students to get bogged down in
the rhetoric of conclusions as was the pattern with traditional
courses, the new curricula were to stress doing science and learning
how to learn. New science curricula quickly came to be associated
with process objectives and skills while traditional science curricula
were tabbed as being fact-oriented. The process versus product
characterization of new and traditional science curricula still
persists.

After nearly 25 years, and over 5 billion dollars, the question
of how effective new science curricula actually were in enhancing
student performance is still unanswered. Money for continued
development of science programs has been withdrawn and public sentiment
apparently favors a move back to the basics. This move back to the
basics would imply support for more traditional, fact-oriented science
courses. However, the decisions to withdraw support for curriculum
development and implementation and to re-emphasize traditional course
objectives should be based on a careful examination of evidence, not
on some gravity-like force that moves the curriculum pendulum back
and forth. This report addresses the question of new science
curriculum effectiveness by meta-analyzing the results of many studies which
have addressed this issue during the past 25 years.

STATEMENT OF THE PROBLEM 

This study was designed to synthesize quantitatively the
collective research dealing with the effects of new science curricula on
student performance. This meta-analysis incorporates large numbers of
studies pertaining to the overall assessment and evaluation of new
science curricula (versus traditional courses) on student performance.
The meta-analysis approach to research integration, developed by
Glass (1976), applies the attitude of data analysis to quantitative
summaries of individual studies. Meta-analysis is a statistical
analysis of the results of a large number of analyses of original
research on a common topic.

For the purpose of this report, new science curricula are defined 
as those courses or curricular projects which: 

a) were developed after 1955 (with either private or
public funds),

b) emphasize the nature, structure, and processes of
science,

c) integrate laboratory activities in daily class
routine, and

d) emphasize higher cognitive skills and appreciation
of science.

Traditional curricula are defined as those courses or programs which: 

a) were developed or patterned after a program developed 
prior to 1955, 

b) emphasize knowledge of scientific facts, laws, theories, 
and applications, and 

c) use laboratory activities as verification exercises or
as secondary applications of concepts previously
covered in class.

In applying the above criteria to research studies reviewed, the 
identification of new curricula was much more clear cut than the 
identification of the traditional courses due to the lack of detailed 
information supplied in the studies. Similarly, it was difficult to 
establish the level of treatment fidelity in most studies for both 
the new curricula and the traditional; new curricula may have been 
used in traditional ways in some cases and vice versa. Where infor- 
mation about such anomalies was available, such information was coded 
and analyzed separately. 

In addressing the overall question of the effectiveness of new
science curricula developed since 1955, the data in this report are
organized in three broad categories: curriculum characteristics,
student or teacher characteristics, and study or design characteristics.
The variable analyzed in all cases is the effect size (labeled E.S.
or Δ in the remainder of this report). The effect size is a common
metric derived from the criterion variable data reported in the
individual studies included in the report. Representing both a magnitude
and direction of group differences, the effect size metric facilitates
a quantitative synthesis of individual studies in which student
performance in new science programs and traditional science courses
are compared.
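
As a rough illustration of how such a metric is obtained from a single study, the sketch below computes an effect size in the basic form associated with Glass (1976): the difference between the treatment and control group means, divided by the control group standard deviation. This is a minimal sketch under stated assumptions, not the report's actual procedure (which is described in its later section on effect size calculation); the function name and the score lists are hypothetical and invented purely for illustration.

```python
from statistics import mean, stdev

def glass_delta(treatment_scores, control_scores):
    """Effect size: (treatment mean - control mean) / control-group SD.

    A positive value favors the treatment (here, a new science curriculum);
    a negative value favors the control (a traditional course).
    """
    return (mean(treatment_scores) - mean(control_scores)) / stdev(control_scores)

if __name__ == "__main__":
    # Hypothetical test scores, not data from any study in the report.
    new_curriculum = [12, 14, 16]
    traditional = [10, 12, 14]
    print(glass_delta(new_curriculum, traditional))
```

Because the result is expressed in standard deviation units, effect sizes from studies that used different tests can be averaged and compared, which is what allows the synthesis across studies described above.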

In this meta-analysis, effect sizes were calculated for one or
more of the 18 a priori discrete criterion variables selected for
analysis. Calculated effect sizes were analyzed for each of the
eighteen criteria separately and in clusters of related criteria.
The individual criteria (lettered) and the criterion clusters
(numbered) are as follows:

1. Achievement Cluster
   a. Cognitive - low
   b. Cognitive - high
   c. Cognitive - mixed/general achievement

2. Perceptions Cluster
   d. Affective - attitude toward subject
   e. Affective - attitude toward science
   f. Affective - attitude toward procedure/methodology
   g. Self-concept

3. Process Skills Cluster
   h. Process skills
   i. Methods of science

4. Analytic Skills Cluster
   j. Critical thinking
   k. Problem solving

5. Related Skills Cluster
   l. Reading
   m. Mathematics
   n. Social studies
   o. Communications skills

6. Miscellaneous
   p. Creativity
   q. Logical thinking (Piagetian)
   r. Spatial relations (Piagetian)

Using the effect sizes calculated from the eighteen individual
criterion measures and the composite effect sizes calculated for the
six criterion clusters as the dependent variables and the three broad
factors (i.e., curriculum characteristics, student or teacher
characteristics, and study or design characteristics) as the independent
variables, a series of specific questions dealing with the effect of
new science curricula on student performance were generated and
analyzed. The individual criteria and criterion cluster effect size
measures were analyzed by specific curriculum (e.g., PSSC, CBA, ESS),
by curriculum type (e.g., physical science, life science, earth
science), by grade level, community type, student gender, student race,
student socio-economic status, teacher training, teacher
characteristics, length of study, validity of study, curriculum profile, method
of testing, and form of publication. The "Results" section of this
report provides a complete description of each question analyzed along
with the appropriate statistical summaries.

BACKGROUND 

Science courses have been a part of the school curriculum for
well over 200 years. During this period educational philosophies,
functions, purposes, goals, and objectives have changed
dramatically. Similarly, the role of science in education has changed. In
assessing the impact of the new science curricula developed during
the past 25 years, it seems appropriate to reflect upon some of the
historical events leading up to the curriculum development era
immediately following the launching of Sputnik.

To begin, what are referred to as "traditional" science courses
actually are courses or textbooks written in the post-World War II
era (1945-1956). Immediately following the war, the science curriculum
lacked articulation and coordination. General science was considered
a junior high subject, biology was typically required in 10th grade,
and chemistry and physics were offered as 11th and 12th grade electives
and viewed as college preparatory courses. By 1950, some additional
courses such as applied science, physiology, electricity, earth
science, and physical science were offered.

During the post-World War II period, less emphasis was placed on
the memorization of information and more emphasis was placed on the
functional aspects of science. Although information acquisition was
still considered the most important goal in education, the
understanding of scientific principles and the development of problem
solving skills were also stressed. The laboratory gained new
acceptance and importance during this period as well (Collette, 1973).

The latter stages of this period saw several changes in the
science curriculum: curricula were developed for gifted science
students; new courses in earth science were developed; attempts were
made to correlate science with other curricular areas; and, for the
first time, attention was given to elementary school science.
Competition in the textbook industry intensified, resulting in improved and
updated textbooks and laboratory manuals in all science areas.
Manufacturers of scientific materials and equipment also began making
serious efforts to improve classroom products (Lacey, 1966; Richardson,
1964; Thurber and Collette, 1968).

By the mid-1950's, scientists and educators were becoming
increasingly concerned over the decreasing percentage of high school students
enrolled in science courses, especially in physics. Colleges were
also beginning to express concern about the quality of students'
high school science preparation (Novak, 1969; Washton, 1967). The
rapid scientific and technological advances of this period began to
pose a serious societal and educational problem. An understanding of
science and technology was becoming imperative. As Hurd (1961) noted,
the nature of this education had not yet evolved.

The National Science Foundation (NSF), conceived in the 1940's
and born in 1950, was ready to go to work when the nation became
concerned about the scientific capability and the status of science
education in the United States. NSF began locating brilliant
investigators and got them to work doing imaginative fundamental research.
As NSF began to establish itself as the primary supporter of basic
research, it also began to get involved with public education. The
initial educational efforts were conservative in that NSF provided
graduate fellowships to the brightest young scientists in order to
attract them into becoming research scientists. Within a short period
of time, however, NSF realized that if a dramatic growth in the
scientific and technological workforce was to be accomplished without
reducing quality, then the entire talent pool from which scientists
are drawn had to be greatly enlarged. NSF began to support the efforts
of outstanding university scientists, educators, and learning theorists
in an effort to develop science courses new in conception, design, and
content and to educate teachers. Many scientists turned their
attention from the laboratory to the classroom and became actively involved
in the curriculum reform movement. What followed was what many have
come to regard as the Golden Age of Science Education (Rutherford,
1980).

One of the first tasks in initiating the reform was to
examine the existing courses of study in science. Upon
examination of the science textbooks of the 1950's, it was evident that
sporadic attempts had been made to keep texts up-to-date by
adding bits and pieces to already existing content. The major problem,
however, was that traditional topics were never deleted. Science
textbooks, in general, contained a mass of often unrelated information
much of which was incorrect, outdated, and irrelevant to modern
science. The conclusion was that existing courses could not be
salvaged and that new courses of study in line with modern science and
modern learning theory would have to be developed (Collette, 1973).

The curriculum reform movement had a gradual beginning with the 
formal organization of the Physical Science Study Committee (PSSC) 
late in 1956. This committee was the result of the 1954 recommenda- 
tion of the Division of Physical Science of the National Academy of 
Science that professional physicists work with high school and college
instructors in order to develop new physics courses and materials. 
The plan was to bring about "immediate" change. 

The result was that some of the most innovative and spectacular
changes ever to occur in American public school education took place
in the area of science (Collette, 1973). The public became very
science and technology conscious. Along with the increased public
support came increased financial support from federal agencies and
private foundations. From 1956-1967 the NSF contribution to curriculum
reform projects at all levels exceeded $100,000,000 (Welch, 1968).
NSF also substantially increased the number of programs to improve the
science backgrounds of teachers. Colleges and universities
established institute programs which offered courses in science and
mathematics in order to update teachers. Whereas in 1953 there were only
two NSF summer institutes in science and mathematics, in 1963 there
were 412 such institutes with about 21,000 teachers receiving
instruction (Science Policy Research Division Report, 1975).



[illegible] half of curriculum development and
implementation, the United States had apparently established a
preeminence in science education to match its status in basic scientific
research (Rutherford, 1980). The hundreds of millions of dollars
spent on curriculum development and implementation generally was felt
to be a good investment (Conant, 1976; Schlessinger and Helgeson, 1969;
Welch, 1968). Unfortunately though, many people felt that the job had
been accomplished, and thus the nationally funded curriculum efforts
began to slow down rapidly. A small cadre of science educators claimed
that only part of the job had been completed and urged NSF to continue
its work in the area of science education (Rutherford, 1980).

During the period of curriculum development, implementation, and
in-service institutes, numerous evaluation studies were completed to
assess the impact of these innovative programs on student performance.
The most typical assessment was a comparative study measuring one or
more student outcome variables with one of the new curricula as a
treatment group and a traditional science course as a control group.
By the mid-1970's, however, curriculum assessment and evaluation
efforts began to taper off without any real conclusive evidence that
the Golden Age of Science Education had produced any substantial gains
besides updating the subject matter.

During the 1975-76 academic year teacher education activities were
suspended. In 1976, NSF responded to Congressional pressure and
awarded contracts to assess the current status of science education at
the elementary and secondary levels (Butts, et al., 1980; Yager, 1981a).
NSF funded a Status Study of three major independent but related
studies to be conducted in parallel (Rutherford, 1980). Each study was
designed from a different perspective to assess the status of science
education in the United States (Helgeson, Blosser, and Howe, 1978;
Stake and Easley, 1978; Weiss, 1978).

The focus of the Helgeson, Blosser and Howe (1978) Status Study,
conducted at the Center for Science and Mathematics Education, The
Ohio State University, was to report on the impact of activity in
curriculum development, teacher education, instruction, and needs in
science education. Specifically, the purpose of their study was to:

1. review, analyze and summarize the appropriate
literature related to pre-college science instruction,
to science teacher education, and to needs
assessment efforts; and

2. identify trends and patterns in the preparation of
science teachers, teaching practices, curriculum
materials, and needs assessments in science education
during the period, 1955-1975. (Helgeson,
Blosser, and Howe, 1978, p. 1)

Their report is divided into five major sections. One section deals
with existing practices and procedures in schools, another summarizes
science teacher education, the following section deals with controlling
and financing education, the next reports on needs assessment efforts,
and the final section presents a summary and trends of needs and
practices.

The second Status Study was organized by a team of researchers at
the University of Illinois and was co-directed by Stake and Easley
(1978). Case Studies in Science Education is a collection of field
observations of science teaching and learning in American public
schools during the school year 1976-77. The study was undertaken to
provide NSF with a portrayal of the current conditions in K-12 science
classrooms to help make NSF's programs of support for science education
consistent with national goals and needs. Eleven high schools and
their feeder schools were selected to provide a diverse and balanced
group of sites. Field researchers were on-site from 4 to 15 weeks and
were instructed to find out what was happening and what was felt to be
important in science (including mathematics and social science)
programs. Each observer prepared an in-depth case study report which was
presented intact as part of a final collection and later augmented with
cross-site conclusions by the Illinois team.

The third Status Study was directed by Weiss (1978) of the Research 
Triangle Institute. The purpose was to design and implement a national 
survey to answer the following questions: 

1. What science courses are currently offered in schools?*

2. What local and state guidelines exist for the
specification of minimal science experiences for students?

3. What texts, laboratory manuals, curriculum kits, modules,
etc., are being used in science classrooms?

4. What share of the market is held by specific textbooks
at the various grade levels and subject areas?

5. What regional patterns of curriculum usage are evident?
What patterns exist with respect to urban, suburban,
rural, and other geographic variables?

6. What "hands-on" materials, such as laboratory or
activity centered materials, are being used? What is
the extent and frequency of their use by grade level
and subject matter?

7. What audio-visual materials (films, filmstrips/loops,
models) are used? What is the extent, frequency and
nature of their use by grade level and subject area?

8. By grade level, how much time (in comparison with
other subjects) is spent on teaching science?

9. What is the role of the science teacher in working
with students? How has this role changed in the past
15 years? What commonalities exist in the teaching
styles/strategies/practices of science teachers
throughout the United States?

10. What are the roles of science supervisory specialists
at the local district and state levels? How are they
selected? What are their qualifications?

11. How have science teachers throughout the United States
been influenced in their use of materials by Federally-
supported in-service training efforts in science?
(Weiss, 1978, p. 1)

* The National Science Foundation defines science to include the
natural sciences, social sciences, and mathematics.

This survey utilized a national probability sample of districts,
schools, and teachers. The sample was designed so that national
estimates of curriculum usage, course offerings and enrollments, and
classroom practices could be made from the sample data. The sample
included superintendents, supervisors, principals, teachers, and other
school personnel.

The Office of Education (OE) also funded a project to assess the
status of science education. The third assessment of science as part
of the National Assessment of Educational Progress (1978) provides
information regarding the results of science instruction in the United
States. This report is a comprehensive assessment of science knowledge,
skills, attitudes, and educational experiences of precollege students
(Kahl and Harms, 1981). The third assessment included a new battery of
items which provided information regarding affective outcomes of
science education for nine-, thirteen-, and seventeen-year-olds, as
well as for an adult sample (Yager, in press).

In 1978, NSF funded a project to synthesize and to interpret the
information from the three K-12 Status Study reports and the NAEP
assessment. This research effort, called "Project Synthesis",
examined K-12 science education from five perspectives (biology,
physical science, inquiry, elementary school science, and science/technology
and society) within four goal clusters and critical elements for
teaching (e.g., instructional procedures, teacher characteristics,
instructional facilities and materials, and others) (Yager, 1981a).

In an attempt to increase the scope of the three K-12 Status Study 
reports, NSF (1980) selected nine professional organizations with 
different responsibilities and perspectives to analyze the studies 
independently and submit reports. The organizations selected were: 

Teacher Organizations 

1. National Council for the Social Studies 
2. National Council of Teachers of Mathematics
3. National Science Teachers Association 

Science Organizations 

1. American Association for the Advancement of Science 

2. National Academy of Science 

Administration and Support Organizations 

1. American Association of School Administrators 

2. Association for Supervision and Curriculum 
Development 

3. National Congress of Parents and Teachers 

4. National School Boards Association 
(Rutherford, 1980) 

The reports of these organizations provide an interesting and
informative view regarding the totality of science education in American
schools (Rutherford, 1980) and are available in the NSF document
entitled, "What Are the Needs in Precollege Science, Mathematics, and
Social Studies Education? Views from the Field."

Finally, in the spring of 1979, NSF funded a Status Study of 
Graduate Science Education in the United States, 1960-1980 (Yager, 
1980b). The purpose of this project was to consider the current status 
of science education at graduate institutions. This study was viewed 
as an extension of the three Status Studies for K-12 science education
(Helgeson, Blosser, and Howe, 1978; Stake and Easley, 1978; Weiss, 
1978) and as a logical next step to consider the unique features of 
the discipline of science education as perceived by science educators 
from institutions throughout the United States. Funds from this 
project also allowed a summer writing group to assemble at The Univer- 
sity of Iowa. A paper entitled "Crisis in Science Education" resulted 
from this effort (Yager, 1980a). 

The three K-12 Status Study reports (Helgeson, Blosser, and Howe,
1978; Stake and Easley, 1978; Weiss, 1978), the professional reviews 
of the Status Study reports (NSF, 1980), and the reports proclaiming 
a crisis in science education (Yager, 1980a, 1980b) all provide an 
interesting assessment of where science education has been, where 
science education is today, and where science education should be 
headed. But the hard truth is that none of the reports has stimulated 
the interest of public or private groups to the extent that the groups 
conducting the studies had originally hoped. The qualitative nature 
of these assessments may explain the diminished impact of the results.
Critics tend to question the overall validity of qualitative analyses
where problems with investigator bias are difficult to control. 

Quantitative synthesis techniques considerably reduce the potential
for investigator bias. A meta-analysis of research focusing on the
various criterion variables, criterion clusters, and criterion clusters 
by study variables provides a comprehensive assessment of the effects 
of the new curricula on student cognitive and affective achievement. 
Such a comprehensive assessment should establish specific and firm 
conclusions of value to practitioners. 

Since curriculum revision and evaluation is a continuing process, 
the conclusions of this meta-analysis are important to researchers 
assessing curriculum development and implementation for several rea- 
sons: (1) those areas where questions of interest have already been 
adequately answered will be identified; (2) those areas where the 
research results are inconclusive or are not worthy of further investi- 
gation will be identified; (3) those questions which have not yet been 
adequately explored will be revealed. This knowledge should result in 
fewer research projects being devoted to duplication of research which 
does not appear to be necessary, fewer research projects being devoted 
to unimportant questions or issues, and more research projects being 
directed to the major questions which are yet unanswered. Such a 
synthesis is long overdue. 

Finally, the results of this study have potential significance 
for groups which establish public and educational policies, as well as 
groups which implement these policies. A comprehensive assessment of
the effectiveness of science curricula developed since 1955 should
provide valuable information for future development and research
activities.

PROCEDURES 

This study was designed to investigate the impact and effects of 
the new curricular programs developed for elementary, junior high, and 
secondary science education since 1955. The meta-analysis perspective
of research integration, developed by Glass (1976), is utilized to
record quantitatively the properties and findings of studies which
measured and compared student performance in a new science curriculum
with student performance in a traditional course. Only studies
involving United States samples are included in this meta-analysis.
This ground rule was established since the curricula studied were
originally designed for use in American schools and modifications are
generally made when these curricula are adopted for use internationally.

This section includes a description of: the research methods 
involved in conducting this meta-analysis, the studies included in this 
meta-analysis, the coding variables and coding reliability, procedures 
regarding effect size calculations, and methods of data analysis. 

Description of the Search Methods 
The first task in conducting a meta-analysis is to locate and 
obtain the relevant research studies in the field of interest. The 
first step in this project was to collect and examine a representative 
sample of science education research studies in order to map out the 
research literature to be meta-analyzed. Literature was sampled across
time and type of publication from the following sources: Dissertation
Abstracts International, The Journal of Research in Science Teaching,
Science Education, and the most recent abstracts of presentations for
the National Association for Research in Science Teaching annual
convention. Literature searches were then conducted on a sampling basis
to obtain an estimate of the size of the relevant literature. These
searches were conducted using Dissertation Abstracts International,
ERIC, Social Science Research, and Science Education: A Dissertation
Bibliography (1978).

Arrangements were made with the ERIC Center to borrow the large 
number of dissertation microfilms which were identified. It was deter- 
mined that the sequence of searching and the subsequent coding of 
documents would be as follows: dissertations, research documents and 
reports available from ERIC or on microfiche, published journal 
articles, and other documents identified during the coding process. 
The rationale for the above order was the desirability of beginning 
with primary and most comprehensive sources, as well as to avoid any
duplication of data in situations where researchers later reported all 
or part of their research studies in professional journals. The final 
stage of the search procedure was to review the following journals for 
relevant studies reported from 1955 to 1980 which were not previously 
coded: American Biology Teacher, High School Journal, Journal of
Chemical Education, Journal of Research in Science Teaching, Journal
of Secondary Education, The Physics Teacher, Science Education, The
Science Teacher, and School Science and Mathematics.

Studies Included

Three hundred two studies were examined for this meta-analysis. 
One hundred five of those studies contained sufficient data for the 
meta-analysis. Studies included in the meta-analysis had to satisfy 
the following criteria: 

1. Studies had to be conducted at the elementary, junior
high, or secondary level between 1955 and 1980.
College level studies were included if the curricula
were not modified and if the students had no prior
course in that science discipline.

2. Studies had to be conducted in the United States using
United States samples. Thus, comparative studies
between United States samples and international samples
were not included.

3. Studies had to be an experimental investigation comparing
student performance in a new science curriculum to student
performance in a traditional course (e.g., ESS versus
traditional, ESCP versus traditional, BSCS versus traditional).
Descriptive or theoretical studies are not
included in this meta-analysis, nor are studies which
only reported student performance on variables for
which there was no control group.

Three hundred forty-one effect sizes (Δ's) were calculated from
the studies included in the meta-analysis. Table 1 shows the distribution
of effect sizes by source of study. These studies represent a
total sample size of 45,626 students. Table 2 shows the distribution
of the student sample by source of study.

TABLE 1

DISTRIBUTION OF EFFECT SIZES (Δ's)
BY SOURCE OF STUDY

Effect Sizes     Dissertations      ERIC Documents     Journal Articles
per Study        Studies    Δ's     Studies    Δ's     Studies    Δ's

     1              20       20        2         2        6         6
     2              20       40        4         8        7        14
     3               9       27        1         3        3         9
     4               6       24        0         0        5        20
     5               4       20        1         5        0         0
     6               6       36        0         0        1         6
     7               2       14        1         7        0         0
     8               2       16        0         0        1         8
     9               0        0        0         0        0         0
    10               1       10        1        10        0         0
    11               0        0        0         0        0         0
    12               0        0        0         0        0         0
    13               0        0        0         0        0         0
    14               0        0        0         0        0         0
    15               0        0        0         0        0         0
    16               1       16        0         0        0         0
    17               0        0        0         0        0         0
    18               0        0        0         0        0         0
    19               0        0        0         0        0         0
    20               1       20        0         0        0         0

  TOTAL             72      243       10        35       23        63



TABLE 2

DISTRIBUTION OF STUDENT SAMPLE
BY SOURCE OF STUDY

                              Treatment    Control     Total

Dissertations (N = 72)          13,987      14,569    28,556
ERIC Documents (N = 10)          4,145       3,462     7,607
Journal Articles (N = 23)        4,645       4,818     9,463
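
The arithmetic behind Table 2 can be cross-checked with a short sketch. The dictionary below simply transcribes the tabulated counts; the names TABLE_2 and check_table_2 are illustrative and do not come from the report.

```python
# Student counts transcribed from Table 2: (treatment, control, reported total).
TABLE_2 = {
    "Dissertations": (13_987, 14_569, 28_556),
    "ERIC Documents": (4_145, 3_462, 7_607),
    "Journal Articles": (4_645, 4_818, 9_463),
}


def check_table_2(table):
    """Confirm each row sums to its reported total and return the grand total."""
    grand_total = 0
    for source, (treatment, control, total) in table.items():
        if treatment + control != total:
            raise ValueError(f"{source}: {treatment} + {control} != {total}")
        grand_total += total
    return grand_total


grand_total = check_table_2(TABLE_2)
print(grand_total)  # should match the total sample size quoted in the text
```

The grand total agrees with the 45,626 students reported for the 105 studies.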



Coding Variables 

There are numerous study characteristics which can influence the 
effectiveness of treatments in comparative studies. A critical part 
of this meta-analysis involved identifying and coding factors related 
to studies. In order to make full use of statistical methods in the 
meta-analysis, various features of each study were measured or otherwise
expressed in quantitative terms. Many of these features are
expressed in familiar scales (e.g., date of publication, length of
study in weeks, IQ of students, grade level) while other features are
nonordinal characteristics which are coded by indicator variables
(e.g., form of publication, secondary school background, curriculum
profile, rated internal validity, the specific characteristics of the
treatment). The coding form utilized in this study was developed by
the investigating team during a week-long meeting and meta-analysis
training session in Boulder, Colorado. (Refer to Appendix A for the
complete coding form.) The coding form is subdivided into the following
categories: background and coding information, sample characteristics,
treatment characteristics, teacher characteristics, and effect
size calculation. Each of these categories is discussed below.

Each study was read and a coding form was completed for each out- 
come and each comparison in the study. A list of coding conventions 
was developed during the week-long training session. These were used 
to guide the classification of studies whose characteristics were 
ambiguous. These conventions are also explained below.

Background and Coding Information 
The numeric coding of each study extended across two computer 
cards - 176 digits of coding in all. The reader ID# identified the
number of the card in the data record of two cards (i.e., ID#: 1 or 2).
Each study was identified by a reader code and a study code. The 
reader code identified the project site and the researcher at that 
site who coded the study. 

The comparison code refers to the number of different treatments 
compared to a control group within a study. A comparison code of 01 01 
would indicate one comparison within the study while a comparison code 
of 02 03 would indicate the second comparison or treatment group of a 
total of three treatments (e.g., ESS, SCIS and S-APA are compared to
a control group).

The outcome code refers to the number of dependent outcome
variables assessed in the study. The coding system of outcomes for a
study is the same as comparisons within a study. Thus, an outcome
code of 01 03 refers to the first identified variable coded of a total
of three variables coded (e.g., a single study may have assessed
cognitive factors, affective factors, and critical thinking).
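
The two-field comparison and outcome codes described above can be read as an (index, total) pair. The sketch below is only an illustration of that reading; the names PairCode and parse_pair_code are hypothetical and are not taken from the coding form.

```python
from typing import NamedTuple


class PairCode(NamedTuple):
    index: int  # which comparison (or outcome) this coding form describes
    total: int  # how many comparisons (or outcomes) the study contains


def parse_pair_code(code: str) -> PairCode:
    """Parse a two-field code such as '02 03' into (index, total)."""
    fields = code.split()
    if len(fields) != 2:
        raise ValueError(f"expected two fields, got {code!r}")
    index, total = (int(field) for field in fields)
    if not 1 <= index <= total:
        raise ValueError(f"index {index} is outside 1..{total}")
    return PairCode(index, total)


# '02 03': the second of three treatments compared with a control group.
print(parse_pair_code("02 03"))
```

A study with one comparison and three outcomes would therefore generate three coding forms, with outcome codes 01 03, 02 03, and 03 03.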

The date of publication was recorded as stated on the coded manuscript.
In some cases studies were published more than once. In
these cases the most complete source was coded. If the manuscripts
were similar, then the earliest date of publication was recorded.

The form of publication was classified according to the form in
which the coded study appeared: journal article, book, MA/MS thesis,
dissertation, or unpublished manuscript. The most complete source of
data was recorded. Thus, if a dissertation was later published in a
journal, the dissertation was coded.

Sample Characteristics 

A number of variables were coded which were specifically related 
to the student sample of each study included in the meta-analysis. 

The grade level of the students was coded and classified into five 
categories: primary (K-3), intermediate (4-6), junior high (7-9),
senior high (10-12), and post secondary. The post secondary classification
was included for any studies which might have used one of the
newly developed curricula at the community college or college level.

The total sample size represents the total number of students in
the treatment and control groups.

The length of the study was coded in weeks indicating the duration
of the treatment. Sequential studies were coded up to a duration of
three years. All sequential studies longer than three years' duration
were categorized together.

Gender was coded as the percentage of female students in each 
study. For studies which did not state the percentage of males and 
females this figure was inferred. For elementary, junior high, and 
required secondary science courses the percentage of female students 
in the study was inferred to be 50%. For chemistry courses the percentage
of female students was coded as 25%. For physics courses the
percentage of female students was inferred to be in the 10-15% range 
depending upon the total sample size. This range was used for rounding 
purposes since physics studies generally had fewer subjects. 

The average ability of the students was recorded on the basis of 
low (below 95 IQ), average (95-105 IQ), or high (above 105 IQ). The
homogeneity of the IQ was recorded as homogeneous or heterogeneous, as
well as the source of IQ (i.e., whether it was stated within the study
or inferred). If the average ability of the students in the sample
was inferred, it was recorded as being heterogeneous average IQ if the
sample was an elementary, junior high, or required secondary science
course. If the sample was a chemistry course the average ability was
coded as high ability heterogeneous. If the sample was a physics
course the average ability was inferred to be high ability homogeneous.
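
The ability conventions above amount to a small lookup. The following is a hedged sketch only: the function names, the course labels ("required", "chemistry", "physics"), and the tuple layout are assumptions made for illustration, not part of the coding form.

```python
def classify_ability(mean_iq):
    """Map a stated mean IQ onto the three ability levels described above."""
    if mean_iq < 95:
        return "low"
    if mean_iq <= 105:
        return "average"
    return "high"


# Defaults applied when ability was not stated, keyed by a hypothetical course label.
INFERRED_ABILITY = {
    "required": ("average", "heterogeneous"),  # elementary, junior high, required secondary
    "chemistry": ("high", "heterogeneous"),
    "physics": ("high", "homogeneous"),
}


def code_ability(course, mean_iq=None, homogeneity=None):
    """Return (level, homogeneity, source), where source is 'stated' or 'inferred'."""
    if mean_iq is not None:
        return classify_ability(mean_iq), homogeneity, "stated"
    level, inferred_homogeneity = INFERRED_ABILITY[course]
    return level, inferred_homogeneity, "inferred"


print(code_ability("physics"))
print(code_ability("chemistry", mean_iq=101, homogeneity="heterogeneous"))
```

Recording the source alongside the level keeps inferred values distinguishable from stated ones in later analyses.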

The race of the sample and the predominant minority were coded if
the information was provided in the study. Race was recorded as the
percentage of non-white students. The predominant minority categories
were: Mexican, non-Mexican Hispanic, Oriental, American Indian, Black,
or other. The percentage of predominant minority was also recorded if
that information was provided.

The socio-economic status of the sample was coded as low, medium,
or high. The homogeneity of the socio-economic status was also
recorded (i.e., homogeneous or heterogeneous). In some instances this 
information was inferred from the geographic location of the study 
site. 

The secondary school science background was coded for each study. 
The following courses were coded either as "yes" or "no" regarding the 
secondary student's prior science: life science (typically a 7th grade
course), physical science (typically an 8th grade course), general 
science or earth science (typically 9th grade courses), biology (typi- 
cally a 10th grade elective), chemistry (typically an 11th grade elec- 
tive), and physics (typically a 12th grade elective). If the students' 
science background was not stated in the study, it was inferred that 
they had taken all courses prior to the science course they were 
currently enrolled in with the exception of earth science. 

The handicapped variable was used to code any studies in which 
the student sample involved any of the following physical or emotional 
handicaps: visually impaired, hearing impaired, learning disability, 
emotionally disturbed, multiple handicaps, or educable mentally 
retarded. 

The sample size of students in both the treatment and control
groups was recorded (N of pupils in T1 and N of pupils in T2). The
% mortality of T1 and T2 was also recorded if that figure was reported
in the study.

The special grouping by ability variable was used to code whether 
students were grouped into a low, medium, or high track; or, whether 
students were not grouped by ability. 

The size of the school involved in the study was coded as stated. 
The following criteria were used: less than 50 students, 50-199 stu- 
dents, 200-499 students, 500-999 students, 1000-1999 students, greater 
than 2000 students. 

The type of community was also coded as stated in the study, or 
inferred on the basis of geographic location of the study site, as 
follows: rural, suburban, or urban. 

Treatment Characteristics 
The treatment code variable refers to the elementary, junior high,
or secondary science curriculum which was used as the treatment course
in each study coded. The majority of these curricula were identified
prior to any coding. A few curricula, however, were added to the 
coding list after the coding process began. A complete list of cur- 
ricula treatment groups follows: 

Elementary Science

Elementary Science Study (ESS)
Science Curriculum Improvement Study (SCIS; or, SCIIS, SCISII)
Science - A Process Approach (S-APA)
Outdoor Biology Instructional Strategies (OBIS)
Elementary Science Learning by Investigation (ESLI)
ESSENCE
Conceptionally Oriented Program for Elementary Science (COPES)
Modular Activities Program in Science (MAPS)
Unified Science and Mathematics for Elementary Schools (USMES)
Minnesota Mathematics and Science Teaching Project (MINNEMAST)
Individualized Science (IS)
Science Curriculum for Individualized Learning (SCIL)
Elementary School Training Program in Scientific Inquiry
    (University of Illinois) (ESTPSI)
Flint Hills Elementary Science Project (Kansas State
    Teachers College) (FHESP)

Junior High Science

Human Science Program (HSP)
Time, Space and Matter (TSM)
Individualized Science Instructional System (ISIS)
Intermediate Science Curriculum Study (ISCS)
Introductory Physical Science (IPS)
Earth Science Curriculum Project (ESCP)
Interaction of Matter and Energy (IME)
Conservation Education/Environmental Education/Ecology (CE/EE)
Montclair Science Project (MSP)

Secondary Science

Biological Science Curriculum Study (BSCS)
    Special Materials (BSCS/SM)
    Yellow Version (BSCS/Y)
    Blue Version (BSCS/B)
    Green Version (BSCS/G)
    Advanced Materials (BSCS/A)
Chemical Education Materials Study (CHEM Study)
Chemical Bond Approach (CBA)
Physical Science Study Committee (PSSC)
Harvard Project Physics (HPP)
Conservation Education/Environmental Education/Ecology (CE/EE)
Physical Science for Nonscience Students (PSNS)
Interdisciplinary Approaches to Chemistry (IAC)

A curriculum profile was established for the major elementary, 
junior high, and secondary science curricula. The profile assessed 
each curriculum on five parameters: (1) degree of inquiry, (2) empha- 
sis on process skills, (3) emphasis on the laboratory and/or laboratory 
skills, (4) degree of individualization, and (5) emphasis on content. 
Each parameter was ranked from low (1) to high (4). The scores for 
each curriculum represent an average score based on assessments of five
science educators familiar with each of the programs. The curriculum
profile of major curricula was developed for two purposes: (1) to be
able to record any modifications made within the context of each 
individual study regarding any of the parameters, and (2) to make com- 
parisons between curricula. The study modification to curriculum 
profile variable indicates whether modifications were made toward the 
low end of each curriculum profile category, toward the high end of 
each curriculum profile category, or whether there were no modifica- 
tions made. See Table 3 for curriculum profile data. 
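
As a rough illustration of the second purpose (comparisons between curricula), a profile can be held as a tuple of five ratings and two curricula compared parameter by parameter. The ratings for ESS, SCIS, and PSSC are transcribed from Table 3; the names PARAMETERS, PROFILES, and profile_difference are hypothetical.

```python
# Order of the five profile parameters, each rated from 1 (low) to 4 (high).
PARAMETERS = ("inquiry", "process_skills", "lab", "individualization", "content")

# Ratings transcribed from Table 3 for three of the curricula.
PROFILES = {
    "ESS": (4, 3, 4, 4, 1),
    "SCIS": (3, 3, 3, 3, 2),
    "PSSC": (1, 3, 3, 2, 4),
}


def profile_difference(first, second):
    """Return first-minus-second for each profile parameter."""
    pairs = zip(PROFILES[first], PROFILES[second])
    return {name: a - b for name, (a, b) in zip(PARAMETERS, pairs)}


print(profile_difference("ESS", "PSSC"))
```

On these ratings ESS sits well above PSSC on inquiry and individualization and well below it on emphasis on content.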

The technology used variable indicates whether hand held calcula- 
tors, films, television, or computers were used or not within the 
study. 

Teacher Characteristics 
For studies which stated the ratio of male to female teachers 
involved in the experiment, the percentage of female teachers was 
recorded. If reported, the average number of years of science teaching
experience was coded, as well as the average number of years teaching
science curriculum T1 and the average number of years teaching science
curriculum T2.

The race of the teachers involved in the study and the predominant
minority were coded if the information was provided. The predominant
minority categories were: Mexican, non-Mexican Hispanic, Oriental, 
American Indian, Black, and other. The percentage of predominant 
minority was also recorded if that information was provided. 

The average educational background for teachers involved in each
study was coded as follows: less than a Bachelors degree, Bachelors
degree, Bachelors plus 15 hours, Masters degree, Masters plus 15 hours,
Masters plus 30 hours, Doctorate degree.

TABLE 3

CURRICULUM PROFILE

                                     Process   Emphasis   Degree of           Emphasis
                           Inquiry   Skills    on Lab     Individualization   on Content

Elementary Curricula
  ESS                         4         3         4              4                 1
  SCIS                        3         3         3              3                 2
  S-APA                       2         4         3              2            [illegible]
  OBIS                        3         2         3              2                 2
  ESLI                        2         2         2              2                 2
  ESSENCE                     1         1         1              4                 1
  COPES                       2         3         2              2                 3
  MAPS                        2         3         3              2                 3
  USMES                       3         3         3              2                 1
  MINNEMAST                   2         2         3              2                 3

Junior High Curricula
  [label illegible]           2         2         2              2                 3
  ISIS                        3         4         3              3                 2
  ISCS                        2         2         4              3                 4
  IPS                         2         3         4              2                 2
  ESCP                        2         2         3              2                 4
  IME                         2         2         3              2                 3

Secondary Curricula
  BSCS (Special Materials)    3         3         4              4                 3
  BSCS Yellow                 2         3         3              2                 3
  BSCS Blue                   2         3         2              2                 4
  BSCS Green                  3         3         3              2                 3
  BSCS Advanced               3         3         4              4                 3
  CHEM Study                  2         3         3              2                 3
  CBA                         1         2         2              1                 4
  PSSC                        1         3         3              2                 4
  Project Physics             2         3         3              3                 3

The remaining coding variables in this section deal with teacher 
training: was preservice training provided? and, was inservice
training provided? The financial funding of inservice training was
coded if such information was provided: locally funded and/or spon- 
sored; university funded and/or sponsored; federally funded. 

Design Characteristics 
A characteristic often considered important in judging the quality 
of a comparative study is how the experimenter allocated subjects to 
the treatment and control groups. The assignment of students to groups 
variable represents whether students were randomly assigned to groups,
selected in matched pairs, part of intact groups, or volunteered to be
a part of the experiment (self-selecting). The assignment of teachers
to groups variable was coded for random assignments, non-random assignments,
self-selecting assignments, or for situations where teachers
taught both groups (crossed) or were matched on certain measures.

The unit of analysis variable coded whether individual students, 
a classroom of students, an entire school, or some other group of stu- 
dents was used as the primary unit of analysis in the study. 

The type of study was coded according to Campbell and Stanley 
(1963) definitions as: correlational, quasi-experimental, experimental,
or pre-experimental.

The rated internal validity was judged on the basis of the assign- 
ment of subjects to groups and the extent of subject mortality in the 
study. Low internal validity studies were those whose matching procedures
were weak or nonexistent, or where intact convenience samples
were used. The study was also rated low if mortality was exceptionally 
high or severely disproportionate. Medium internal validity ratings 
were assigned according to the following criteria: (1) studies with 
randomization but high or differential mortality, or (2) studies with 
"failed" randomization procedures (e.g., where the experimenter began 
by randomizing, but then resorted to other allocation methods) and low 
mortality, or (3) studies with intact groups but highly similar and 
low mortality, or (4) extremely well-designed matching studies. To be
judged high on the internal validity measure, a study must have used
random assignment of subjects to groups and have low and fairly
equivalent mortality rates.
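
The rating rules above can be sketched as a decision function. This is an approximation under stated assumptions, not the coders' actual procedure: the category labels ('random', 'failed_random', 'matched', 'intact'; 'low', 'high', 'extreme') and the two boolean flags are hypothetical encodings of judgments that the coders made by reading each study.

```python
def rate_internal_validity(assignment, mortality,
                           groups_similar=False, strong_matching=False):
    """Return 'low', 'medium', or 'high' following the rules described above.

    assignment: 'random', 'failed_random', 'matched', or 'intact' (assumed labels)
    mortality:  'low', 'high' (includes differential), or 'extreme'
                ('extreme' stands for exceptionally high or severely
                disproportionate mortality)
    """
    if mortality == "extreme":
        return "low"
    if assignment == "random":
        # Randomization with low, fairly equivalent mortality is the only 'high'.
        return "high" if mortality == "low" else "medium"
    if assignment == "failed_random" and mortality == "low":
        return "medium"
    if assignment == "intact" and groups_similar and mortality == "low":
        return "medium"
    if assignment == "matched" and strong_matching:
        return "medium"
    # Weak or absent matching, or intact convenience samples.
    return "low"


print(rate_internal_validity("random", "low"))
print(rate_internal_validity("intact", "low", groups_similar=True))
```

Treating 'extreme' mortality as overriding every other consideration is one reading of the text; the report does not state an order of precedence.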



Occasionally, statistical or measurement irregularities decreased
the level of internal validity (e.g., when an otherwise well-designed 
study employed different testing times for the treatment and control 
groups). It is also recognized that other factors such as sample size, 
congruence of the measures with the treatment or control groups, the 
method of measurement, or the reactivity of the measurement influence 
internal validity. These five constructs were assessed separately. 

Outcome Characteristics 
The content of measure variable identified the science discipline 
involved in the study: life science, physical science, general science,

earth science, biology, chemistry, or physics. All elementary studies 
were coded as general science. 

The congruence of measure with T1 and T2 is a measure of test
reactivity. Congruence was measured as low, medium, or high. For
example, if a general achievement test designed specifically for PSSC
was used to compare achievement of PSSC students versus non-PSSC
students, the congruence for T1 (treatment group) was coded high and the
congruence for T2 (control group) was coded low.

The type of criterion refers to the twenty-two criterion variables
identified for coding. The eighteen variables for which data were
obtained were grouped into six criterion clusters for analysis. The
six criterion clusters and the eighteen individual criterion variables
are listed in the "problem statement" section.

The criterion measured variable identifies whether the study 
assessed student performance or teacher performance. There were no
studies included in this meta-analysis which assessed teacher performance.



The method of measurement indicates whether the study measurement 
was: a standardized test; an ad hoc written test (e.g., developed by
researcher, curriculum project); observational (e.g., passive or
instructional observations); or, a structured interview or assessment.

The reactivity of the measurement refers to the level of researcher 
bias in the tests used. Standardized tests were considered to have low 
reactivity while experimenter-made tests were judged to have high 
reactivity. 

Effect Size Calculations 

The source of effect size data variable refers to whether the
effect size was: calculated directly from reported data or raw data
from the study (e.g., means and variances); reported with direct
estimates (e.g., ANOVA, t-test, F-values); calculated directly from
frequencies reported on ordinal scales (probit, χ²); calculated backwards
from variance of means with randomly assigned groups; calculated
from nonparametric statistics (other than χ²); guessed from independent
sources (e.g., test numbers, other students using the same test,
conventional wisdom); estimated from variance of gain scores
(correlational guessing); or, derived from probability level only (i.e.,
conservative estimates).

The source of means was coded as reported in each study. The 
following categories were used: unadjusted post-test; covariance 
adjusted; residual gains; pre-post-test differences; or, other. 



The reported significance of each study was coded as: p < .005;
.005 < p < .01; .01 < p < .05; .05 < p < .10; or, p > .10.

The dependent variable units were coded if they were reported in 
grade-equivalent units or some other unit. The mean difference in 
grade-equivalent units was reported if the dependent variable was 
reported in grade-equivalent units. 

If the group variances were observed individually, then the ratio 
of experimental to control group variances was calculated, as well as 
the effect size based on experimental group variance (A), the effect 
size based on control group variance (B), and an average effect size
based on (A) and (B). If the group variances were not observed
vidually, the study effect size was reported directly from the source 
of the effect size data. 
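
When both group variances are reported, the quantities named above can be computed as in the sketch below. Two points are assumptions: the "average effect size" is taken to be the arithmetic mean of (A) and (B), which the text does not specify, and the function and key names are illustrative.

```python
import math


def effect_sizes_by_variance(mean_e, mean_c, var_e, var_c):
    """Effect sizes standardized by each group's variance, plus their average."""
    diff = mean_e - mean_c
    es_a = diff / math.sqrt(var_e)  # (A): based on experimental group variance
    es_b = diff / math.sqrt(var_c)  # (B): based on control group variance
    return {
        "variance_ratio": var_e / var_c,  # experimental to control
        "es_experimental": es_a,
        "es_control": es_b,
        "es_average": (es_a + es_b) / 2,  # assumed: simple mean of (A) and (B)
    }


result = effect_sizes_by_variance(mean_e=55.0, mean_c=50.0, var_e=100.0, var_c=25.0)
print(result)
```

A variance ratio far from 1 signals that (A) and (B) will diverge, which is when the choice of standardizer matters most.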

Reliability 

Once the coding variables were identified and ground rules were 
established, estimates of coder reliability were calculated. The 
reliability of a measurement "is the statement which represents the 
various sources of error in the repeated measurement of a single 
phenomenon or the consistency in which an individual performs the same 
task over a period of time" (Brown and Webb, 1968, p. 37). An
instrument itself is neither reliable nor unreliable — it is only 
when the instrument has been used to collect data that one can speak 
sensibly about reliability. 
Based upon a random sampling of five studies read and coded independently
by the two coders involved in the study, a 94.8% coder
agreement was attained in coding the 76-80 study variables (studies
which reported group variances individually contained 80 study
variables while studies which did not report group variances individually
had only 76 study variables).
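
Coder agreement of this kind is commonly computed as the percentage of variables on which two coders recorded the same value. The sketch below assumes that simple definition; the report does not say how agreement was pooled across the five studies, and the function name and sample codes are invented for illustration.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of variables on which two coders recorded the same value."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same number of variables")
    if not coder_a:
        raise ValueError("at least one coded variable is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for four variables of one study; the coders differ on one.
agreement = percent_agreement([1, 2, 3, 4], [1, 2, 3, 5])
print(agreement)
```

Simple percent agreement does not correct for agreement expected by chance, so it tends to run higher than chance-corrected indices.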

Procedures Regarding Effect Size Calculation 
The magnitude of the effect of a treatment is the most important
variable in any outcome study. In this study, the effect of new
curricula on student performance was assessed by measuring the magnitude
and direction of change for twenty-two criterion variables. Meta-analysis
involves calculating a common metric for defined variables
within a study. The common metric, measuring the magnitude of the
effect, is referred to as an effect size (abbreviated E.S. and symbolized
by the Greek letter Δ). The effect size is a normalized
measure of the performance difference of two groups on a dependent
variable (e.g., general achievement, critical thinking, self-concept).
Effect size is defined as the mean difference between treatment conditions
divided by within-group standard deviation (Glass, 1976).



$$\text{E.S.} = \frac{\bar{X}_t - \bar{X}_c}{SD_c}$$

where: $\bar{X}_t$ = mean of treatment group;
       $\bar{X}_c$ = mean of control group; and
       $SD_c$ = standard deviation of control group.
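
As a minimal sketch of this definition, assuming raw scores are available for both groups and that the sample (n - 1) standard deviation of the control group is intended, the effect size can be computed as follows. The function name and the score lists are illustrative only.

```python
import statistics


def glass_delta(treatment_scores, control_scores):
    """Mean difference divided by the control group's standard deviation."""
    mean_t = statistics.mean(treatment_scores)
    mean_c = statistics.mean(control_scores)
    sd_c = statistics.stdev(control_scores)  # sample SD, n - 1 denominator (assumed)
    return (mean_t - mean_c) / sd_c


# Hypothetical post-test scores for a new-curriculum class and a traditional class.
delta = glass_delta([12, 14, 16, 18, 20], [10, 12, 14, 16, 18])
print(round(delta, 3))
```

A positive value favors the treatment (new curriculum) group and a negative value favors the control group.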

Nearly all of the effect sizes calculated for this study used
either the formula above or, in studies which reported F values, the
F-value was considered equal to $t^2$ and the following formula was
used:

$$\text{E.S.} = t\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where: $t$ = t-value;
       $n_1$ = sample size of treatment group; and
       $n_2$ = sample size of control group.

If only the total sample size (N) was reported, it was assumed that
$n_1 = n_2$ since equal n's provide a more conservative estimate of the
effect size than unequal n's.
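
The conversion from a reported t (or F) statistic can be sketched as below. The function names are hypothetical. Because an F-value carries no sign, the direction of the difference must be supplied separately from the reported means; that argument is an assumption of this sketch.

```python
import math


def effect_size_from_t(t, n1, n2):
    """E.S. = t * sqrt(1/n1 + 1/n2) for treatment size n1 and control size n2."""
    return t * math.sqrt(1 / n1 + 1 / n2)


def effect_size_from_f(f, n1, n2, treatment_favored=True):
    """Two-group F treated as t squared; the sign comes from the reported means."""
    t = math.sqrt(f)
    return effect_size_from_t(t if treatment_favored else -t, n1, n2)


def effect_size_from_t_total_n(t, total_n):
    """Only the total N is known: assume two equal groups."""
    half = total_n / 2
    return effect_size_from_t(t, half, half)


print(round(effect_size_from_t(2.75, 6, 6), 2))
# Equal groups give the smaller estimate for the same t and the same total N.
print(effect_size_from_t_total_n(2.0, 40) <= effect_size_from_t(2.0, 10, 30))
```

For a fixed total N, the term 1/n1 + 1/n2 is smallest when the groups are equal, which is why the equal-n assumption is the conservative one.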

In a few instances, the only information reported in the study was
that a particular test statistic (e.g., t or F or Fisher's
Z-transformation of r) was calculated on n cases with a level of
significance p. Provided that the p-value is reported exactly and not
rounded, the transformation is straightforward. If, for example, it is
reported that a two-group t-test with n_1 = n_2 = 6 is significant at
the p = .02 level (two-tailed test), then it is a simple matter of
determining the corresponding t-value:

\[ {}_{.99}t_{10} = 2.75. \]

Knowing n_1, n_2, and the value of the t-test, the effect size can be
calculated using the formula:

\[ \Delta = t \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 2.75 \sqrt{\frac{1}{6} + \frac{1}{6}} = 1.59. \]
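
The conversions described above can be sketched in a few lines. The function names are hypothetical. The F-based version assumes a two-group comparison, where F = t^2, and the sign of the effect has to be taken from the reported group means, because F carries no direction.

```python
import math


def effect_size_from_t(t, n1, n2):
    """E.S. = t * sqrt(1/n1 + 1/n2) for a two-group t-test."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def effect_size_from_f(f, n1, n2):
    """Two-group F is treated as t squared; the result is unsigned."""
    return effect_size_from_t(math.sqrt(f), n1, n2)


def effect_size_from_total_n(t, total_n):
    """When only the total N is reported, assume n1 = n2 = N / 2."""
    half = total_n / 2.0
    return effect_size_from_t(t, half, half)


# Worked example from the text: t = 2.75 with n1 = n2 = 6.
worked_example = round(effect_size_from_t(2.75, 6, 6), 2)
```

With the values from the worked example (t = 2.75, n_1 = n_2 = 6), the first function reproduces the effect size of 1.59 given in the text.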

In studies reporting only an approximation of the p-value for a 
measured criterion variable, the conservative value of p was utilized 
in order to estimate the value of t. This yields a conservative 
effect size. 

The reader is referred to Glass, McGaw, White and Smith (1980, 
p. 136-197) for detailed derivations and illustrations of procedures 
for transforming other reported statistics and measurement scores into 
effect sizes. 

Methods of Data Analysis

During the coding phase of this meta-analysis a total of three
hundred two studies were reviewed. One hundred five of those studies
contained sufficient data for the meta-analysis. From these studies,
which represent a total sample size of 45,626 students, three hundred
forty-one effect sizes (E.S.) were calculated. The coding form for
this meta-analysis (Appendix A) provides information regarding the
variables for which data were collected (i.e., background information,
sample characteristics, treatment characteristics, teacher
characteristics, design characteristics, outcome characteristics, and
effect size calculations).

Thus, in response to the overall question assessing the effects
of new curricular programs developed in science education since 1955,
summary statistics of effect sizes were calculated for: the 27 new
science curricula for which data were collected; the 18 student
performance measures; and the 6 criterion cluster variables. Sample
characteristics such as grade level, community type, length of study,
student gender, and socio-economic status; background information
regarding form of study publication; treatment characteristics; content
characteristics; internal validity; the curriculum profile
characteristics; in-service training of treatment instructors; and
method of measurement of the criterion variable were also coded. The
relationship between the six criterion cluster measures and each of
the variables listed above was analyzed.

Effect size summary statistics calculated for each of the
variables listed above were: mean effect size, minimum Δ, maximum Δ,
standard deviation, and t-value. Statistical analysis was accomplished
using the General Linear Model (GLM) procedure of the Statistical
Analysis System (SAS) (Helwig and Council, 1979) on the IBM 370/168 at
The University of Iowa (programs were run under release 79.4B). The
model source statement of the GLM gives the dependent variables and
independent effects. Due to unequal cell frequencies, orthogonality is
destroyed. A condition for orthogonality is that the number of
observations in each combination of treatments is equivalent (Hayes,
1973). Thus, the Type IV Sum of Squares (SS) is used as described in
SAS-76 (Barr et al., 1976). The corrected total reported for each
analysis is equal to the number of effect sizes in the data set minus
one (N-1). Duncan's Multiple Range Test for specified effect size
variables was also calculated and these data are reported where
appropriate.

All statistically significant data in this report are identified 
by an asterisk (*). Such values are significant at the a priori alpha 
level of 0.05. 

RESULTS AND DISCUSSION 
Curriculum Characteristics 



How do students exposed to various new science curricula 
compare to students exposed to traditional science 
curricula on a composite performance level? 

The literature search revealed 105 codable studies comparing new 
science curricula to traditional science programs in terms of one or 
more student performance criteria. The codable studies encompassed 
27 different science curricula and 18 distinct performance criteria. 
As an overall indicator of new curriculum effectiveness, effect size
data extracted from all studies on a specific new science curriculum
were summarized. This summary of a composite performance analysis is
presented in Table 4 for the 27 new curricula included in this
meta-analysis. Although the data in Table 4 do not provide information
about the specific focus of the original research studies analyzed,
the composite student performance data by new curricula do provide
a starting place, a first approximation regarding the effectiveness
of new science curricula.



TABLE 4

EFFECT SIZE DATA FOR COMPOSITE STUDENT
PERFORMANCE MEASURES BY CURRICULUM

CURRICULUM       N   MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Elementary
  ESS           11   0.37         0.01         0.81         0.27         4.56*
  SCIS          45   0.30         [illegible]  [illegible]  0.55         3.64*
  SAPA          45   0.27         -0.57        [illegible]  [illegible]  3.26*
  USMES         17   0.55         [illegible]  4.48         1.05         2.17*
  MINNEMAST      2   1.51         [illegible]  [illegible]  1.35         1.57
  IS             2   0.64         0.60         0.69         0.06         14.33*
  SCIL          16   0.43         [illegible]  [illegible]  [illegible]  4.44*
  ESTPSI         6   [illegible]  [illegible]  [illegible]  0.28         3.35*
  FHESP          1   -0.06        --           --           0.00         --

Junior High
  HSP            4   0.66         0.46         0.85         0.18         7.24*
  TSM            1   0.49         --           --           0.00         --
  ISCS           6   0.18         -0.10        0.74         0.31         1.39
  IPS           17   0.00         -0.44        0.44         [illegible]  0.04
  ESCP          24   0.16         -0.70        0.86         [illegible]  2.11*
  IME            4   0.22         [illegible]  0.66         [illegible]  1.02
  CE/EE          1   0.01         --           --           0.00         --
  MSP            3   0.21         0.11         0.42         0.17         2.13

Secondary
  BSCS/SM        4   0.11         0.01         0.29         0.12         1.72
  BSCS/Y        28   0.48         -0.50        1.78         0.57         4.36*
  BSCS/B         6   2.32         0.44         4.18         1.41         4.03*
  BSCS/G         5   0.13         -0.18        0.34         0.21         1.36
  BSCS/A         4   0.09         -0.17        0.43         0.29         0.62
  CHEM          33   0.12         -0.49        0.92         0.37         1.84
  CBA           16   0.24         -0.81        1.09         0.45         2.15*
  PSSC          35   0.47         -1.04        2.70         0.69         4.08*
  HPP            2   0.28         0.27         0.30         0.02         19.00*
  PSNS           3   0.11         0.08         0.15         0.03         5.20*

*Value is significant at the a priori alpha level (0.05).



Thus, Table 4 contains a summary of the number of effect sizes
calculated for each curriculum (N), the mean effect size, the maximum
and minimum effect size, and the standard deviation of the effect size
around the mean. Also listed is the t-value for the test of statistical
difference between the mean effect size calculated and zero.
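
As a rough check on how the tabled t-values relate to the other columns, the sketch below assumes the test is an ordinary one-sample t-test of the mean effect size against zero, computed from the tabled mean, standard deviation, and N. The function name is hypothetical, and small discrepancies from the tabled values are to be expected because the tabled means and standard deviations are rounded.

```python
import math


def t_from_summary(mean_effect, sd_effect, n):
    """One-sample t against zero: mean / (SD / sqrt(N)), with N - 1 df."""
    if n < 2:
        raise ValueError("t is undefined for a single effect size")
    if sd_effect <= 0:
        raise ValueError("standard deviation must be positive")
    return mean_effect / (sd_effect / math.sqrt(n))


# ESS row of Table 4: N = 11, mean = 0.37, S.D. = 0.27 (tabled t = 4.56).
ess_t = t_from_summary(0.37, 0.27, 11)
```

This also shows why no t-value can be listed for curricula with a single effect size: with N = 1 there is no variability from which to form the test.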

Recall that by definition the effect size is a measure of the
mean difference in performance between students in new science
curricula and students in traditional courses divided by the
within-group standard deviation of the control group (traditional
course). Thus, an effect size of zero indicates that there were no
observable differences between the two groups for the composite
performance measures. A positive effect size signifies that students
in the treatment group (new curricular group) performed better than
the control group for the observed measures of student performance,
whereas a negative effect size signifies that student scores in the
control group (traditional course) were higher.




The composite data in Table 4 clearly indicate that students who
were exposed to new science curricula performed better than their
traditional course counterparts. Disregarding curricula where only one
effect size was calculated (FHESP, TSM, and junior high conservation
education/environmental education), the average composite student
performance measure effect sizes range from Δ = 0.00 for IPS to
Δ = 2.32 for BSCS (Blue Version). It should also be noted that the most
heavily studied curricula (N > 15) also show a definite positive
impact. The average composite student performance effect sizes for
these curricula range from Δ = 0.00 for IPS to Δ = 0.55 for USMES, with
80% of the curricula in this category being statistically significantly
different from zero at the a priori alpha level of 0.05. Furthermore,
if these average effect sizes are translated into percentile scores,
the average USMES student performed better than 71% of the traditional
course students, whereas the average BSCS (Blue Version) student
performed better than 99% of their traditional course counterparts.
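
The translation from effect size to percentile can be sketched as follows. It assumes that control-group scores are normally distributed, so that an effect size is read as a z-score on the control distribution. The function name is hypothetical.

```python
import math


def percentile_from_effect_size(delta):
    """Percent of the control group scoring below the average treated student,
    assuming a normal control distribution (standard normal CDF of delta)."""
    return 100.0 * 0.5 * (1.0 + math.erf(delta / math.sqrt(2.0)))


usmes_percentile = round(percentile_from_effect_size(0.55))  # USMES, Δ = 0.55
bscs_blue_percentile = round(percentile_from_effect_size(2.32))  # BSCS Blue, Δ = 2.32
```

Under that assumption, Δ = 0.55 corresponds to roughly the 71st percentile and Δ = 2.32 to roughly the 99th, in agreement with the figures quoted above. An effect size of zero corresponds to the 50th percentile.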

How do students exposed to new science curricula compare
to students exposed to traditional science curricula on
various performance criteria?

Another general indicator of new curriculum effectiveness is
provided in the breakdown of effect size data for each of the 18
performance criteria measured. An examination of the mean effect sizes
in Table 5 indicates that the new curricula had a positive impact on
student performance for every performance criterion except student
self-concept. Eleven of these positive differences were found to
be statistically significantly different from zero.

TABLE 5

EFFECT SIZE DATA FOR PERFORMANCE CRITERIA

CRITERION              N            MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Achievement:
  Cognitive-Low        8            0.02         -0.46        0.50         0.29         0.20
  Cognitive-High       11           0.05         [illegible]  0.41         0.28         0.60
  Cognitive-Mixed      111          0.43         -1.04        4.18         0.77         5.83*

Perceptions:
  Affective-Subject    6            0.51         [illegible]  [illegible]  0.32         3.87*
  Affective-Science    25           [illegible]  0.11         1.75         0.36         6.89*
  Affective-Method     10           [illegible]  -0.81        1.20         [illegible]  [illegible]
  Self-Concept         10           -0.08        -0.82        0.82         0.53         -0.51

Process Skills:
  Techniques           28           0.61         -0.10        2.50         [illegible]  [illegible]*
  Methods of Science   28           [illegible]  -0.62        0.73         0.32         2.79*

Analytic Skills:
  Critical Thinking    [illegible]  0.19         -0.35        [illegible]  0.37         2.77*
  Problem Solving      [illegible]  0.71         0.06         1.41         [illegible]  2.02

Related Skills:
  Reading              23           0.10         -0.41        0.92         0.24         1.99*
  Mathematics          18           0.40         -0.50        4.48         1.07         1.59
  Social Studies       2            0.25         0.25         0.26         0.00         51.00*
  Communications       5            0.40         0.08         0.75         0.26         3.47*

Miscellaneous:
  Creativity           5            0.71         0.18         1.50         0.50         3.22*
  Spatial Relations    2            0.57         0.29         0.86         0.40         2.02

*Value is significant at the a priori alpha level (0.05).

The small number of effect sizes available for some criteria may
limit a meaningful interpretation of those criteria (e.g., spatial
relations and social studies). However, the consistent pattern of
positive effect size values clearly establishes the superiority of
the new science curricula over traditional courses in enhancing
student performance over a broad range of performance measures.

Especially interesting in the composite data of Table 5 are the
statistics for general achievement (N = 111). Much criticism
regarding the new science curricula focused on the apparent decline
of general science knowledge among students exposed to the new
programs. At the height of the new curricular movement (and even
today) the prevailing notion was that the process goals of the new
science curricula were being achieved at the expense of the content
goals, although no comprehensive data base existed for either claim.
The data in Table 5 show clearly that students exposed to new science
curricula achieved 0.43 standard deviations above (exceeding 67% of
the control group), or nearly one-half of a grade level better than,
their traditional curriculum counterparts.

In the areas where most new curriculum opponents would concede
superiority, Table 5 indicates consistently positive effect size
patterns. Student attitudes toward the subject specifically, science
generally, and the new format of the courses (method) all show
statistically significant positive results. Similarly, the areas
involving higher cognitive skills (e.g., problem solving, critical
thinking, logical thinking, and creativity) show consistently positive
effect size patterns. Even student performance in related areas such as
reading, mathematics, and communication skills, areas in which new
curriculum proponents often purported student gains, shows positive
effect size data.

The slightly negative effect size mean for the self-concept data
appears, at first, to be an anomaly when considered with the other
affective measures regarding student attitude toward the specific
subject, science, and methods. However, in the majority of the studies
coded, the self-concept measures assessed global self-concept rather
than subject-specific self-concept. Thus, one would not expect the
global self-concept to change dramatically during a period of 21-36
weeks (the average length of treatment in the studies coded). In
fact, when considering the goals and objectives of the new curricula
and the emphasis upon student decision making, one might expect
student self-concept to be deflated a little at the outset and for the
duration of a course. Thus, the slightly negative, near zero effect
size (Δ = -0.08) for student self-concept appears to be predictable and
reasonable.

The cumulative effect size data in Table 5 make it possible to
examine the impact new science curricula had on specific areas of
student performance such as achievement, attitudes toward science,
techniques of science, or critical thinking. However, the small
number of effect sizes focusing on certain individual performance
parameters (e.g., problem solving, N = [illegible]; spatial relations,
N = 2) and the obvious relationship between other performance
parameters suggested the need for a smaller number of more broadly
defined criterion clusters. Moreover, the larger number of effect sizes
within criterion clusters facilitates the examination of more detailed
questions regarding new curriculum effectiveness. The criterion
clusters and the individual performance parameters comprising the
clusters are listed below:

Achievement Cluster
a. Cognitive-low (Recall of facts, laws, principles)
b. Cognitive-high (Application, synthesis, evaluation)
c. Cognitive-mixed (General achievement)

Perceptions Cluster
d. Affective-attitude toward subject
e. Affective-attitude toward science
f. Affective-attitude toward method/class environment
g. Affective-attitude toward self (self-concept)

Process Skills Cluster
h. Techniques of science (lab skills, measurement)
i. Methods of science

Analytic Skills Cluster
j. Critical thinking
k. Problem solving

Related Skills Cluster
l. Reading (comprehension/readiness)
m. Mathematics (concepts, skills, applications)
n. Social studies (content, skills)
o. Communication skills (reading, writing, speaking)

Miscellaneous
p. Creativity
q. Logical thinking (Piagetian tasks)
r. Spatial relations (Piagetian tasks)

Table 6 contains descriptive statistics and t-test data for effect
sizes grouped by criterion clusters. The graph of the effect size
means for the clusters is shown in Figure 1. Consistent with the
Table 5 data for individual performance criteria, the criterion
cluster data indicate that students exposed to new science curricula
consistently outperformed students exposed to traditional courses.

How do students exposed to specific new science curricula
compare to students in traditional science courses on the
six criterion cluster measures (i.e., achievement,
perceptions, process skills, analytic skills, related skills,
other areas)?

The analysis of effect size data for specific curricula by
criterion clusters is inherently interesting because of the detail
provided. The increased detail is accompanied by a decrease in
available studies from which effect size data can be extracted. With
18 separate criterion variables studied across the 27 new science
curricula coded in this study, the full matrix would require 486 effect
size calculations to place a minimum of one effect size in each cell.
Even with the clustering of effect size data across related dependent
variables, many of the possible cells yielded no data.



TABLE 6

EFFECT SIZE DATA FOR CRITERION CLUSTERS
ACROSS ALL CURRICULA

VARIABLE           N     MEAN Δ   MINIMUM Δ   MAXIMUM Δ   S.D.   t-value

Achievement        130   0.37     -1.04       4.18        0.73   5.76*
Perceptions        51    0.37     -0.82       1.75        0.49   5.40*
Process Skills     56    0.39     -0.62       2.50        0.56   5.17*
Analytic Skills    35    0.25     -0.36       1.44        0.44   3.29*
Related Skills     48    0.25     -0.50       4.48        0.69   2.51*
Other Areas        21    0.33     -0.70       4.50        0.51   2.93*

*Value is significant at the a priori alpha level (0.05).



Achievement Cluster 

Table 7 lists the effect size data for the achievement criterion
cluster for the 20 new science curricula for which such data were
available. The mean effect size was positive for all but two of the
curricula (FHESP, N = 1, Δ = -0.06 and IME, N = 2, Δ = -0.11),
indicating that students in the new science programs overwhelmingly
outperformed students in traditional courses on achievement measures.
The effect size results are especially impressive for students enrolled
in the BSCS-Yellow (N = 19, Δ = 0.45), PSSC (N = 23, Δ = 0.51) and
SCIS (N = 5, Δ = 1.00) programs. The skeptic might dismiss these
results due to inherent sampling problems caused by students
gravitating to the new science programs such as BSCS or PSSC. Thus, the
potential for a sampling error owing to superior students
self-selecting a new, innovative science program must be considered as
a threat to the internal validity of new curriculum studies. The
question of self-selection bias is addressed separately in a later
analysis (see Table 28). An examination of the data in Table 7 across
all curricula would suggest, however, that self-selection errors in the
original studies were either not a factor or tended to produce
inconsistent effects, since most of the effect size means are not
statistically significantly different from zero.

TABLE 7

EFFECT SIZE DATA FOR THE ACHIEVEMENT CRITERION
CLUSTER BY CURRICULUM

CURRICULUM       N   MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

ESS              3   0.09         0.01         0.21         [illegible]  1.34
SCIS             5   1.00         0.05         2.11         0.91         [illegible]
SAPA            12   0.17         [illegible]  1.65         [illegible]  1.04
USMES            3   0.34         0.11         0.51         0.21         2.74
MINNEMAST        2   1.51         [illegible]  [illegible]  1.35         1.57
SCIL             4   0.06         [illegible]  0.20         0.11         1.18
ESTPSI           3   0.28         [illegible]  0.60         0.27         1.79
FHESP            1   -0.06        --           --           0.00         --
IPS              3   0.03         [illegible]  0.20         [illegible]  0.24
ESCP             6   0.19         [illegible]  0.86         [illegible]  [illegible]
IME              2   -0.11        -0.33        [illegible]  [illegible]  [illegible]
MSP              1   0.42         --           --           0.00         --
BSCS/SM          2   0.02         0.01         0.03         0.01         2.00
BSCS/Y          19   0.45         -0.19        1.78         0.51         3.67*
BSCS/B           2   3.94         3.70         4.18         0.33         16.42*
BSCS/G           2   0.17         0.00         0.31         0.21         1.00
BSCS/A           4   0.09         -0.17        0.43         0.28         0.62
CHEM            23   0.12         -0.19        0.92         0.10         1.37
CBA             10   0.27         -0.12        1.09         0.11         2.05
PSSC            23   0.51         -1.04        2.70         0.77         3.16*

*Value is significant at the a priori alpha level (0.05).

Focusing on some of the more common new science curricula, the
data in Table 7 indicate substantial gains in achievement for students
in: SCIS (Δ = 1.00), USMES (Δ = 0.34), BSCS-Blue (Δ = 3.94), PSSC
(Δ = 0.51), BSCS-Yellow (Δ = 0.45), and CBA (Δ = 0.27). In all of
these curricula, the average student in the new science curricula
exceeds the achievement scores of 60% of the students in the control
group (Δ = 0.25 is equivalent to the 60th percentile of the control
group). Furthermore, other major curricula with moderate effects such
as IPS (Δ = 0.03), ESS (Δ = 0.09), CHEM Study (Δ = 0.12), BSCS-Green
(Δ = 0.17), S-APA (Δ = 0.17) and ESCP (Δ = 0.19), equivalent to a
percentile ranging from the 51st to the 58th percentile of the control
group, should be viewed quite favorably as support for the philosophy
and effects of the new science curricula, since the major objections
to these programs centered around their lack of emphasis on science
content. Clearly, the data in Table 7 indicate that students
enrolled in new science programs are not stifled in their acquisition
of scientific knowledge.



Student Performance 

The data in Table 8 relate to the comparison of students in new
science curricula versus students in traditional courses on attitudes
toward the specific subject matter, the broad area of science, the
classroom climate, and the students themselves (self-concept). The
analysis reveals significantly enhanced student attitudes in 5 of the
9 new curricula where multiple effect sizes were coded. Of the more
popular curricula, S-APA (Δ = 0.39) and HSP (Δ = 0.66) showed the
most positive effects while SCIS (Δ = 0.08) and CBA (Δ = 0.16) showed
the least positive changes.

All of the curricular means for the perceptions cluster reveal
positive effects, in spite of the fact that negative effects were
coded for SCIS, SCIL and CBA. Thus, in terms of affective measures,
students generally felt better about the specific course they were
taking, the methods employed, science in general, and themselves while
enrolled in a new science program. It should be noted, however, that
only 16 of the 27 curricula studied had been investigated within this
large domain called perceptions (over 40% of these curricula had only
1 effect size calculated). However, data in Table 20 would tend to
indicate that student perceptions by grade level were greatly enhanced,
i.e., elementary grades (Δ = 0.28), junior high (Δ = 0.53) and high
school (Δ = 0.44).

TABLE 8

EFFECT SIZE DATA FOR THE PERCEPTIONS CRITERION
CLUSTER BY CURRICULUM

CURRICULUM       N   MEAN Δ   MINIMUM Δ   MAXIMUM Δ   S.D.   t-value

ESS              1   0.51     --          --          0.00   --
SCIS            14   0.08     -0.82       0.82        0.52   0.59
SAPA             6   0.39     0.00        0.76        0.28   3.35*
USMES            1   0.15     --          --          0.00   --
IS               2   0.64     0.60        0.69        0.06   14.33*
SCIL             8   0.61     -0.32       1.12        0.44   3.86*
HSP              4   0.66     0.46        0.85        0.18   7.24*
[illegible]      1   0.17     --          --          0.00   --
IPS              2   0.23     0.21        0.25        0.02   11.50*
ESCP             1   0.11     --          --          0.00   --
IME              2   0.55     0.45        0.66        0.14   5.29
BSCS/Y           1   1.05     --          --          0.00   --
BSCS/G           1   1.75     --          --          0.00   --
BSCS/B           2   0.25     0.21        0.29        0.05   6.25
CBA              4   0.16     -0.81       0.76        0.69   0.45
PSNS             1   0.15     --          --          0.00   --

*Value is significant at the a priori alpha level (0.05).

Process Skills

Process objectives have become synonymous with new science
curricula over the years. The debate over the relative importance of
process skill development versus content knowledge acquisition drew
considerable attention in the sixties and early seventies. The issue
still stimulates discussion today even though many teachers have
resigned themselves to a content emphasis. The data in Table 9 deal
with the process skill record of the new science curricula. The
research record clearly indicates a success story for most of the new
curricula, but not all curricula show equal success. Both IPS
(Δ = -0.08) and CHEM Study (Δ = -0.03) accumulated a record of
negative performance on process skills development.

TABLE 9

EFFECT SIZE DATA FOR THE PROCESS SKILLS
CRITERION CLUSTER BY CURRICULUM

CURRICULUM       N   MEAN Δ   MINIMUM Δ   MAXIMUM Δ   S.D.   t-value

ESS              4   0.47     0.26        0.70        0.18   [illegible]
SCIS             6   0.56     0.12        1.20        0.36   3.79*
SAPA             3   1.08     -0.02       2.50        1.28   [illegible]
USMES            1   0.29     --          --          0.00   --
SCIL             4   0.43     0.38        0.47        0.04   18.34*
ESTPSI           3   0.50     0.15        0.73        0.30   2.81
ISCS             1   0.30     --          --          0.00   --
IPS              5   -0.08    -0.44       0.23        0.30   -0.63
ESCP             8   0.22     -0.62       0.52        0.39   1.60
BSCS/SM          1   0.11     --          --          0.00   --
BSCS/Y           4   0.72     0.23        1.76        0.71   2.01
BSCS/B           1   2.45     --          --          0.00   --
CHEM             5   -0.03    -0.33       0.26        0.22   -0.37
CBA              1   0.34     --          --          0.00   --
PSSC             5   0.35     -0.10       1.19        0.49   1.60
HPP              2   0.28     0.27        0.30        0.02   19.00*
PSNS             2   0.09     0.08        0.10        0.01   9.00

*Value is significant at the a priori alpha level (0.05).

The slightly negative results for both IPS and CHEM Study
are especially interesting because of the emphasis both place on the
integration of laboratory activities in total course work. The lack
of success shown by the studies of these curricula suggests ineffective
curricular materials or improper implementation of potentially
effective materials. Again, the lack of detailed information describing
the treatment conditions (i.e., with how much integrity were the
new materials implemented) prohibits a thorough investigation of the
alternatives.

The remaining data in Table 9 follow a trend similar to the
achievement data in Table 7 where effect size data are available on
both criteria for a particular curriculum. The curricula showing
strong positive effect size values in achievement (A) show similar
values in process skill (P.S.) measures (e.g., SCIS: Δ_A = 1.00,
Δ_P.S. = 0.56; BSCS-Yellow: Δ_A = 0.45, Δ_P.S. = 0.72; PSSC:
Δ_A = 0.51, Δ_P.S. = 0.35).

Analytic Thinking 

Perhaps the area most stressed by new science curriculum
developers in the golden years of curriculum reform was problem solving
and critical thinking. Capturing just the right mixture of text
material, laboratory activity, and stimulating problems was the dream
of every curriculum engineer and the challenge awaiting every new
science curriculum teacher. Were the new materials being developed in
the post-Sputnik years any more effective in cultivating student
analytic thinking skills than the traditional courses they were
replacing? Many new science curricula appeared and disappeared before
that question was fully explored. In fact, it is questionable whether
the issue was ever explored fully, considering that Table 10 indicates
only 35 codable effects addressing the question of new science
curriculum impact on student analytic thinking skill, with only 5
curricula revealing more than one effect size.

TABLE 10

EFFECT SIZE DATA FOR THE ANALYTIC THINKING
CRITERION CLUSTER BY CURRICULUM

CURRICULUM       N   MEAN Δ   MINIMUM Δ   MAXIMUM Δ   S.D.   t-value

SAPA             1   0.06     --          --          0.00   --
ISCS             1   0.07     --          --          0.00   --
IPS              5   -0.15    -0.36       0.12        0.22   -1.57
ESCP             7   0.16     -0.05       0.44        0.18   2.31*
CE/EE            1   0.01     --          --          0.00   --
MSP              1   0.12     --          --          0.00   --
BSCS/SM          1   0.29     --          --          0.00   --
BSCS/Y           3   0.42     0.03        1.08        0.57   1.27
BSCS/B           2   0.94     0.44        1.44        0.70   1.88
BSCS/G           1   -0.18    --          --          0.00   --
CHEM             5   0.30     -0.08       0.75        0.32   2.08
CBA              1   0.21     --          --          0.00   --
PSSC             6   0.53     0.01        1.41        0.61   2.12

*Value is significant at the a priori alpha level (0.05).

The four most frequently studied curricula (IPS, ESCP, CHEM, and
PSSC) show slightly mixed results. The IPS studies showed an overall
negative impact while the other three showed a positive effect, with
PSSC being the highest at 0.53 standard deviations. Perhaps the
most surprising data on analytic thinking are those for the BSCS
curricula. The 7 studies on these curricula produced a mean effect
size of Δ = 0.46, second only to physics (Δ = 0.53). For a subject
area generally considered non-quantitative at the high school level,
these results are very impressive.

Related Skills

The related skills cluster contains those studies conducted to
determine the effects of new science curricula on mathematics skills,
reading skills, social studies performance, and communications skills
(e.g., writing and speaking). The promise of enhanced student
performance in related skill areas was never advertised loudly by new
curriculum proponents, but the inference that gains in these areas
could be achieved as an added benefit could be drawn from much of the
early rhetoric.

As Table 11 indicates, only three of the new science curricula
actually were studied to any extent for their impact on related skill
areas: SCIS, SAPA, and USMES, all elementary level programs. While
the mean of the USMES study effect sizes is the most impressive
(Δ = 0.66), the top-end value of 4.48 and the resulting standard
deviation for the effect size data of 1.25 leave the overall mean
somewhat suspect. It is probably safe to conclude, however, that
student performance in related skill areas was positively enhanced
through participation in such curricula.



TABLE 11

EFFECT SIZE DATA FOR THE RELATED SKILLS
CRITERION CLUSTER BY CURRICULUM

CURRICULUM       N   MEAN Δ   MINIMUM Δ   MAXIMUM Δ   S.D.   t-value

SCIS            13   0.21     -0.01       0.54        0.15   5.00*
SAPA            18   0.10     -0.41       0.75        0.29   1.46
USMES           12   0.66     -0.01       4.48        1.25   1.84
ISCS             2   -0.03    -0.10       0.03        0.09   -0.54
MSP              1   0.11     --          --          0.00   --
BSCS/Y           1   -0.50    --          --          0.00   --
PSSC             1   0.04     --          --          0.00   --

*Value is significant at the a priori alpha level (0.05).



Other Performance Areas 

As indicated by the relatively small number of effect size values
listed in Table 12, the number of studies focusing on non-conventional
measures of student performance is relatively small. Fourteen of the
effect sizes included in the table are derived from studies using
Piagetian-type tasks and 5 of the studies utilized creativity measures
as a dependent variable. The paucity of studies on these variables
for any one curriculum makes meaningful synthesis difficult. Perhaps
the most significant conclusion to be drawn from these data is that
more experimental studies need to be conducted using these criteria
as dependent variables.



TABLE 12

EFFECT SIZE DATA FOR THE OTHER MENTAL FUNCTIONS
CRITERION CLUSTER BY CURRICULUM

CURRICULUM       N   MEAN Δ   MINIMUM Δ   MAXIMUM Δ   S.D.   t-value

ESS              3   0.48     0.06        0.81        0.38   2.18
SCIS             7   0.18     -0.55       0.86        0.55   0.88
SAPA             5   0.50     -0.09       1.50        0.62   1.82
TSM              1   0.49     --          --          0.00   --
ISCS             1   0.74     --          --          0.00   --
IPS              2   0.35     0.26        0.44        0.12   3.89
ESCP             2   -0.13    -0.70       0.44        0.80   -0.23




How do students exposed to new science curricula of a
particular content area (physics, chemistry, biology,
earth science, etc.) compare to students in traditional
science courses?

Effect size data are grouped by the science content area for the 
analysis reported in this section. For each study reviewed and coded, 
the content area represented by the curriculum under study was placed 
in one of the following categories: 

(1) Life science
(2) Physical science
(3) General science
(4) Earth science
(5) Biology
(6) Chemistry
(7) Physics

As was mentioned earlier, the ground rule in coding dictated that the
study be labeled with the highest degree of specificity rather than
the most general. Thus, PSSC physics and a chemistry program [name
illegible in the source copy] were classified as physics and
chemistry, respectively, not physical science. Elementary science
curricula were coded as general science.

Life Sciences

This grouping of science curricula includes those dealing with 
topics such as health science and junior high life science. 

As the data in Table 13 indicate, student perceptions is the only
criterion cluster for which multiple effect sizes were
calculated. The strong positive effect size mean (Δ = 0.66) indicates
that the students exposed to the new life science programs developed 
more positive attitudes about science than students participating in 
the standard health and life science programs. 

Physical Science 

The curricula included in the physical science category represent 
junior high school programs for the most part. Because of the recent 
interest in the junior high/adolescent student expressed by funding 
agencies, these data and the subsequent grade level analyses found 
later in this report are significant. The two criterion clusters 
showing the most dramatic differences among the effect size data
reported for the physical science curricula in Table 13 are achievement
and perceptions. The combination of these two sets of performance
data suggests that the new curricula represented by these junior high
studies had a positive impact on the student participants. The only
negative impact is found in the area of problem solving/critical
thinking, where a slightly negative effect size appears (Δ = -0.10).

General Science 

Just as the physical science curricula are most characteristic of
junior high programs, the general science curricula consist
mostly of elementary school programs. Perhaps the most revealing
statistic in the Table 13 data is that 143 effect sizes fall
under the general science category, almost half of all the
effect sizes calculated from the codable studies. The relative wealth
of research in this content area is most likely a function of the
number of students enrolled in elementary science programs compared
to those enrolled in upper grade level programs, and of the
accessibility of elementary school populations.

If consistency of performance of new curricula in any content
area is sought, the general science area is far and away the winner.
In all 5 performance areas where multiple studies were located, the
effect size data indicate that students participating in the new
programs performed significantly better than their traditional course
counterparts. The performance of the average elementary student in
the new science curricula coded exceeds that of 61-72% (Δ = 0.27 to
Δ = 0.59) of the students in traditional science courses for these 5
criterion clusters.
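
The 61-72% figures quoted above can be reproduced by reading an effect size as a z-score and looking up the standard normal cumulative distribution. The sketch below assumes normally distributed scores with equal spread in both groups, which this section does not state explicitly; the function name `percent_exceeded` is illustrative only.

```python
from statistics import NormalDist


def percent_exceeded(effect_size):
    """Percent of the traditional-course group scoring below the average
    new-curriculum student, assuming normal scores with equal variance."""
    return NormalDist().cdf(effect_size) * 100


# Smallest and largest general science cluster means cited in the text
for delta in (0.27, 0.59):
    print(f"effect size {delta:.2f} -> {percent_exceeded(delta):.0f}%")
```

An effect size of zero maps to 50%, meaning the average student in the new curriculum performs at the median of the traditional group.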






Earth Science 

Earth science curricula tend to be used at the 9th grade level,
although some of the new earth science programs have been used as high
school electives and as advanced 7th/8th grade courses. As the data in
Table 13 indicate, studies performed in the earth science area are the
only ones that produced statistically significant differences in the
analytic skills criterion cluster (the high school science areas of
biology, chemistry, and physics each produced substantial positive
effect sizes, but none was significantly different from zero).

In contrast to the significant results in the area of analytic
skills, earth science is also the only content area for which a
positive achievement result was not obtained (Δ = -0.07); however,
this mean is not significantly different from zero.

Biology 

New science curricula in biology are synonymous with BSCS. The
collapsed category of biology represents a composite view of research
completed on the various versions of BSCS (Special Materials, Yellow,
Blue, Green, and Advanced). Of the high school science programs
developed, more codable research was found for BSCS programs than for
any other single project.

An examination of the data in Table 13 reveals an impressive
track record for the BSCS programs. Where multiple effect sizes
exist for a performance cluster, the mean effect size values are
consistently high (Δ ≥ 0.46). One of the more interesting positive
effects of BSCS is in the area of analytic
thinking. The studies coded yielded an effect size mean of 0.46.
This mean is higher than that generated for all chemistry curriculum
research reviewed (Δ = 0.28) and approaches the mean of the physics
program research (Δ = 0.53). Considering that traditional biology is
noted for its preoccupation with facts and labels, the mean of 0.46
is quite an impressive turn-around.

Chemistry 

Two chemistry curricula made up the market of new curricula
during the decade of the sixties: CHEM Study and CBA. Of the three
traditional high school subject areas (biology, chemistry, and
physics), it is probably safe to conclude on the basis of data in
Table 13 that the new chemistry curricula produced the least impact
in terms of enhanced student performance. The mean effect size of
0.16 for the studies on achievement, while significantly different
from zero, is not as impressive as that for biology (Δ = 0.59) or
physics (Δ = 0.50). Only the achievement data for the earth science
curricula (Δ = -0.07) yielded a smaller mean effect size.

An even less impressive figure for the new chemistry programs is
in the area of process skills (Δ = 0.02). Recall that process skill
measures are those reflecting an understanding of and familiarity
with laboratory procedures; designing, executing, and interpreting
experiments; and problem solving procedures that involve active
participation. One would expect this to be a forte of the new
chemistry programs. This does not appear to be the case. While the
mean effect size of 0.28 in the area of analytic skills is
respectable, it does not make up for the dismal record in the
process skill area.

Physics 

Studies of two physics curricula, PSSC and HPP, were coded for
this meta-analysis. The data in Table 13 include 35 effect sizes
generated from PSSC studies and 2 from HPP studies. Interestingly,
no codable studies were found which dealt with student perceptions.
Apparently, researchers were interested more in the cognitive
performance areas than in the affective domain. Yet a grave concern
still exists over the declining enrollments in science and especially
in high school physics.

The performance chart of the new physics programs is second only
to that of the biology curricula in the overall positive effect size
pattern. Studies of achievement and analytic skills yielded mean
effect sizes of about a half standard deviation. Translated into
grade equivalents, this means students participating in the new
physics courses effectively gained a half-year of study on their
traditional course classmates in terms of general physics achievement
and analytic thinking skills. The decline (perhaps demise is a better
word) of the new physics programs should not be attributed to a
lackluster performance based on the research reviewed!






TABLE 13

EFFECT SIZE DATA FOR PERFORMANCE CRITERION
CLUSTERS BY CONTENT OF CURRICULA

CLUSTER             N           MEAN Δ      MINIMUM Δ   MAXIMUM Δ   S.D.        t-value

Life Science
  Perceptions       [illegible] 0.66        [illegible] 0.85        0.18        7.24*
  Analytic Skills   1           [illegible] --          --          0.00        --

Physical Science
  Achievement       9           0.31        -0.33       1.14        0.47        1.97
  Perceptions       8           0.31        0.15        0.66        0.16        [illegible]*
  Process Skills    10          0.08        [illegible] 0.78        [illegible] [illegible]
  Analytic Skills   7           -0.10       [illegible] 0.12        0.21        -1.32
  Other Areas       6           [illegible] [illegible] 0.74        0.50        1.35

General Science
  Achievement       32          0.35        -0.57       2.47        0.68        2.95*
  Perceptions       30          0.32        -0.82       1.12        0.49        3.55*
  Process Skills    19          0.59        [illegible] 2.50        0.52        4.95*
  Analytic Skills   1           [illegible] --          --          0.00        --
  Related Skills    46          0.27        [illegible] 4.48        0.69        2.67*
  Other Areas       15          0.35        -0.55       1.50        0.53        2.53*

Earth Science
  Achievement       4           -0.07       -0.52       0.27        0.32        -0.45
  Perceptions       1           0.11        --          --          0.00        --
  Process Skills    8           0.22        -0.62       0.52        0.39        1.60
  Analytic Skills   7           0.16        -0.05       0.44        0.18        2.31*

Biology
  Achievement       29          0.59        -0.49       4.18        1.04        3.07*
  Perceptions       [illegible] 0.82        [illegible] 1.75        0.72        2.28
  Process Skills    6           0.90        0.11        2.45        0.96        2.29
  Analytic Skills   7           0.46        -0.18       1.44        0.58        2.09
  Related Skills    1           [illegible] --          --          --          --

Chemistry
  Achievement       33          0.16        -0.49       1.09        0.40        2.28*
  Perceptions       4           0.15        -0.81       0.76        0.69        0.45
  Process Skills    6           0.02        [illegible] 0.34        0.25        0.24
  Analytic Skills   6           0.28        -0.08       0.75        0.29        2.40

Physics
  Achievement       23          0.50        -1.04       2.70        0.77        3.16*
  Process Skills    7           0.33        -0.10       1.19        0.40        2.18
  Analytic Skills   6           0.53        0.01        1.41        0.61        2.12
  Related Skills    1           0.04        --          --          0.00        --

* Value is significant at the a priori alpha level (0.05).
[illegible] = entry cannot be read in the source copy; -- = no entry.



Compared to students in traditional science courses, how is
student performance affected by the level of emphasis on
inquiry, process skills, laboratory, individualization, and
content across new science curricula?

Studies focusing on the effectiveness of a particular science
program such as PSSC, CHEM, or BSCS dominate the research literature
on new science curricula. The difficulties of doing large scale
research across curricular or content lines explain the abundance of
the focused studies. However, there is interest in questions that
cut across program and content lines. Questions such as, "How does
the amount of emphasis on inquiry affect student performance?" are of
interest, but they defy easy investigation because of the large
samples needed to override the interactive effects of any one program
or science area. This is not practical in original research. The
quantitative synthesis of study results permits a post hoc analysis of
such questions even though the original studies may not have focused
on the issue.

Using the ratings of a panel of five science educators, profiles
of the science curricula encountered in the research studies analyzed
were constructed. Each curriculum was rated on a scale of 1 (low) to
4 (high) on the level of emphasis on: (1) inquiry, (2) process skills,
(3) laboratory, (4) individualization, and (5) content. With the
available information, profiles were constructed for 21 of the 27
curricula encountered. The profiled curricula accounted for 306 of
the 341 effect sizes available for analysis.

Effect size data were grouped as "high" (3 or 4) or "low" (1 or
2) on each of the five profile factors and analyzed for each
performance criterion cluster. The effect size data for each separate
profile factor are listed in Tables 14-18.
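
The grouping step described above can be sketched as follows. The records are invented for illustration, and two details are assumptions because the text does not give them: each curriculum is taken to carry a single panel rating per factor, and the sample standard deviation (n - 1 denominator) is used.

```python
from math import sqrt
from statistics import mean, stdev


def rating_group(rating):
    """Map a 1-4 panel rating to the report's "high" (3 or 4) / "low" (1 or 2)."""
    if rating not in (1, 2, 3, 4):
        raise ValueError("rating must be 1, 2, 3, or 4")
    return "high" if rating >= 3 else "low"


def summarize_by_rating(records):
    """Summarize effect sizes per rating group.

    records: iterable of (rating, effect_size) pairs for one profile factor
    and one criterion cluster.
    """
    groups = {"high": [], "low": []}
    for rating, effect_size in records:
        groups[rating_group(rating)].append(effect_size)

    summary = {}
    for label, values in groups.items():
        row = {"n": len(values)}
        if values:
            row.update(mean=mean(values), minimum=min(values), maximum=max(values))
        if len(values) > 1:
            sd = stdev(values)  # assumption: sample S.D.
            row["sd"] = sd
            row["t"] = row["mean"] / (sd / sqrt(len(values))) if sd > 0 else None
        summary[label] = row
    return summary


# Hypothetical (rating, effect size) pairs, not data from the report
records = [(4, 0.50), (3, 0.30), (2, 0.10), (1, -0.10), (2, 0.30)]
summary = summarize_by_rating(records)
for label, row in summary.items():
    print(label, row)
```

Each row of Tables 14-18 corresponds to one such group summary for a given criterion cluster.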

Inquiry Emphasis 

Since the explanation of the curriculum profile data would
constitute an exhaustive report by itself, only a brief presentation
of some of the major features of the analyses is given here. Perhaps
the first point of interest lies in the profile ratings themselves.
The panel evaluating the various curricula rated 73% of the new
curricula low on inquiry, 80% high on process skills, 93% high on
laboratory emphasis, 78% low on individualization, and 73% high on
content emphasis. Certainly, these ratings reflect the bias of the
panel and represent a source of error in interpreting the results.
Especially interesting are the low ratings on inquiry and
individualization, ratings that run counter to the original goals
purported by curriculum architects and assumed by uninformed teachers
and lay people.

The data on the inquiry factor in Table 14 do not reveal any
overall pattern of performance but do show data of interest in the
perceptions and related skills areas. With an even distribution of
effect sizes between high and low rated curricula (N = 18 each), the
0.42 effect size associated with curricula with a low rating on
inquiry appears to be considerably more positive than the 0.13
associated with curricula rated high on inquiry. Assuming traditional
curricula would receive a low inquiry rating from these same
panelists, these data suggest that the positive affective student
response to new science programs is not a function of the inquiry
nature of the new materials, but of some other factor(s). This
relationship was explored more fully by correlating the curriculum
profile ratings with the effect size data across all performance
measures and by each performance cluster. As indicated by the
correlation data reported in Table 19, student perception effect
size data consistently correlate negatively with the four factors
most often attributed to new science curricula (inquiry, process
orientation, laboratory emphasis, and individualization).
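
As an illustration of the correlation step, the sketch below computes a Pearson product-moment coefficient between profile ratings and effect sizes. The choice of Pearson's r is an assumption (the section does not name the coefficient used), and the ratings and effect sizes shown are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical inquiry ratings (1-4) paired with perception effect sizes
inquiry_ratings = [1, 2, 3, 4]
perception_effects = [0.6, 0.5, 0.2, 0.1]
r = pearson_r(inquiry_ratings, perception_effects)
print(f"r = {r:.2f}")
```

A negative r, as in this made-up example, corresponds to the pattern the text describes: higher ratings on a factor going with smaller effect sizes on student perceptions.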






The related skills data are just the opposite of the student
perception data when analyzed by level of emphasis on inquiry. Effect
size values are considerably higher (Δ = 0.42 versus Δ = 0.06, Table 14)
for studies conducted on curricula with a high inquiry rating. The
mean data for effect sizes grouped by emphasis on content (Table 18)
and the correlation data in Table 19 substantiate a firm pattern:
student performance in related skill areas is positively
affected in science curricula which emphasize inquiry and negatively
affected in curricula which stress content.

Process Emphasis 

The effect size analysis on the curriculum profile rating for process
skills (Table 15) has two interesting features in addition to the
predictable finding that student performance on process measures is
enhanced considerably in curricula which stress process skills (Δ =
0.50 (high) versus Δ = 0.12 (low)). The analysis indicates that
student performance on analytic skill and related skill measures is
increased significantly when the curricula are rated high on process
skill emphasis (Δ = 0.38 versus Δ = 0.06 for analytic skills; and
Δ = 0.27 versus Δ = 0.01 for related skills). The strong positive
correlation between analytic skill performance and curriculum process
skill profile is borne out in Table 19 as well (r = 0.34). This
strong relationship between process orientation and analytic skill
performance is one that many proponents of the new curriculum
movement asserted, that many opponents doubted, and that few studies
ever addressed.






Laboratory Emphasis 

The role of the laboratory in school science is an issue that
generates emotional debate at all levels of science education: Is it
or is it not a critical component of instruction? The effect size
data analyzed by the degree of emphasis placed on the laboratory in
new science curricula are presented in Tables 16 and 19. An
examination of the means in Table 16 reveals a definite pattern: in
the four performance areas where a dichotomy of high and low emphasis
on laboratory could be established, students participating in studies
where the curriculum under investigation was rated "low" on laboratory
emphasis consistently outperformed students participating in studies
where the curriculum involved was rated "high." Notice, however,
that the means for the low laboratory emphasis curricula were based
on relatively few effect sizes compared to the high emphasis curricula
and that none of the low emphasis means were significantly different
from zero. Recall that only 7% of the effect sizes (N = 21) were
calculated from studies involving curricula receiving a low rating
on laboratory emphasis. The correlation data in Table 19 further
suggest that the increased laboratory emphasis of the new science
curricula is not necessarily a case of "more is better." Indeed, the
negative correlation between laboratory emphasis ratings and effect
size data on student perceptions criteria (r = -0.33) suggests that
students aren't as positive about new science program experiences
when the emphasis on laboratories is increased.




Individualization Emphasis 

Table 17 contains the results of effect size data analyzed by
the degree of individualization of the new curricula. As evidenced
by the data, achievement results are slightly enhanced in curricula
judged higher on individualization while analytic skills are not.
True to form, student perceptions appear to be adversely affected by
increased individualization (Δ = 0.40 (low) versus Δ = 0.11 (high)).
This observation is supported by the r = -0.39 correlation coefficient
in Table 19.

Content Emphasis 

Table 18 contains the effect size data analyzed by the level of
emphasis on content. The data reveal substantial disparities in
only two areas: student perceptions (Δ = 0.42 (high) versus Δ = 0.12
(low)) and related skills (Δ = 0.42 (low) versus Δ = 0.06 (high)).
However, the slightly more positive effect size for the low content
emphasis curricula on achievement (Δ = 0.42) may be the most
significant value when viewed from the new curriculum developers'
perspective. The fact that achievement scores do not plummet when the
emphasis on content is reduced, but in fact actually increase, could
be considered a significant accomplishment if substantial gains in
other performance areas such as student attitude and problem solving
and process skills are also realized!






TABLE 14

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
PROFILE: EMPHASIS ON INQUIRY

CLUSTER           RATING  N           MEAN Δ      MINIMUM Δ   MAXIMUM Δ   S.D.        t-value

Achievement       High    16          0.41        -0.06       2.41        0.64        2.58*
                  Low     106         [illegible] [illegible] [illegible] [illegible] 5.03*
Perceptions       High    18          0.13        -0.82       0.82        0.47        1.16
                  Low     18          0.42        [illegible] [illegible] 0.51        3.46*
Process Skills    High    12          [illegible] [illegible] [illegible] [illegible] [illegible]
                  Low     35          0.36        -0.62       2.50        0.68        3.14*
Analytic Skills   High    2           0.05        -0.18       0.29        0.33        0.23
                  Low     31          0.27        -0.36       1.44        0.46        3.26*
Related Skills    High    25          0.42        -0.09       4.48        0.88        2.42*
                  Low     23          0.06        -0.50       0.75        0.29        1.02
Other Areas       High    10          0.27        -0.55       0.86        0.50        1.70
                  Low     10          0.37        -0.70       1.50        0.57        2.05

* Value is significant at the a priori alpha level (0.05).
[illegible] = entry cannot be read in the source copy.






TABLE 15

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
PROFILE: EMPHASIS ON PROCESS SKILLS

CLUSTER           RATING  N           MEAN Δ      MINIMUM Δ   MAXIMUM Δ   S.D.        t-value

Achievement       High    103         0.39        -1.04       4.18        0.77        5.09*
                  Low     16          0.35        -0.52       2.47        0.66        2.32*
Perceptions       High    28          0.28        -0.82       1.75        0.52        2.85*
                  Low     8           0.25        -0.81       0.76        0.49        1.45
Process Skills    High    33          0.50        -0.33       2.50        0.65        4.39*
                  Low     14          0.12        -0.62       0.52        0.37        1.22
Analytic Skills   High    20          0.38        -0.27       1.44        0.52        3.33*
                  Low     13          0.06        -0.36       0.44        0.23        0.92
Related Skills    High    45          0.27        -0.50       4.48        0.71        2.54*
                  Low     3           0.01        -0.10       0.11        0.10        0.22
Other Areas       High    17          0.35        -0.55       1.50        0.50        2.87*
                  Low     3           0.16        -0.70       0.74        0.76        0.36

* Value is significant at the a priori alpha level (0.05).






TABLE 16

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
PROFILE: EMPHASIS ON LABORATORY

CLUSTER           RATING  N           MEAN Δ      MINIMUM Δ   MAXIMUM Δ   S.D.        t-value

Achievement       High    111         [illegible] [illegible] 1.70        0.62        5.56*
                  Low     11          0.96        [illegible] 4.18        1.52        2.10
Perceptions       High    [illegible] [illegible] -0.82       1.05        0.42        3.17*
                  Low     [illegible] 0.47        [illegible] 1.75        0.93        1.14
Process Skills    High    45          0.34        -0.62       2.50        0.53        4.33*
                  Low     2           1.39        0.34        2.45        1.49        1.32
Analytic Skills   High    30          0.21        -0.36       1.41        0.42        2.79*
                  Low     3           0.69        0.21        1.44        0.65        1.85
Related Skills    High    48          0.25        -0.50       4.48        0.69        2.54*
                  Low     0           --          --          --          --          --
Other Areas       High    20          0.32        -0.70       1.50        0.52        2.73*
                  Low     0           --          --          --          --          --

* Value is significant at the a priori alpha level (0.05).
[illegible] = entry cannot be read in the source copy; -- = no entry.






TABLE 17

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
PROFILE: EMPHASIS ON INDIVIDUALIZATION

CLUSTER           RATING  N           MEAN Δ      MINIMUM Δ   MAXIMUM Δ   S.D.        t-value

Achievement       High    [illegible] 0.48        -0.06       2.41        0.76        2.07
                  Low     [illegible] 0.37        -1.04       4.18        [illegible] [illegible]
Perceptions       High    16          0.11        -0.82       0.82        0.50        0.92
                  Low     20          0.40        -0.81       1.75        [illegible] [illegible]
Process Skills    High    14          0.44        0.11        1.20        [illegible] [illegible]
                  Low     33          0.36        -0.62       2.50        0.70        3.00*
Analytic Skills   High    2           0.11        -0.07       0.29        0.25        0.61
                  Low     31          0.26        -0.36       1.44        0.46        3.20*
Related Skills    High    16          0.17        -0.10       0.54        [illegible] 4.29*
                  Low     32          0.29        -0.50       4.48        0.83        1.98*
Other Areas       High    11          0.31        -0.55       0.86        0.50        2.08
                  Low     9           0.33        -0.70       1.50        0.59        1.68

* Value is significant at the a priori alpha level (0.05).
[illegible] = entry cannot be read in the source copy.






TABLE 18

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY CURRICULUM
PROFILE: EMPHASIS ON CONTENT

CLUSTER           RATING  N           MEAN Δ      MINIMUM Δ   MAXIMUM Δ   S.D.        t-value

Achievement       High    107         0.37        -1.04       4.18        0.77        5.07*
                  Low     15          0.42        -0.27       2.41        0.67        2.45*
Perceptions       High    18          0.42        -0.81       1.75        0.51        3.48*
                  Low     18          0.12        -0.82       0.82        0.47        1.14
Process Skills    High    35          0.36        -0.62       2.50        0.68        3.12*
                  Low     12          0.48        0.12        1.20        0.29        5.66*
Analytic Skills   High    32          0.27        -0.36       1.44        0.45        3.44*
                  Low     1           -0.27       -0.27       -0.27       --          --
Related Skills    High    23          0.06        -0.50       0.75        0.29        1.02
                  Low     25          0.42        -0.09       4.48        0.88        2.42
Other Areas       High    7           0.33        -0.70       1.50        0.68        1.28
                  Low     13          0.31        -0.55       0.86        0.45        2.51*

* Value is significant at the a priori alpha level (0.05).
-- = no entry.



TABLE 19

CORRELATIONS BETWEEN CURRICULUM PROFILE RATINGS AND
EFFECT SIZES CALCULATED FROM STUDENT PERFORMANCE DATA

PERFORMANCE(S)    N       INQUIRY     PROCESS     LABORATORY  INDIVIDUAL  CONTENT

Achievement       122     0.05        0.06        -0.07       0.07        0.04
Perceptions       36      -0.39*      -0.26*      -0.33*      -0.39*      -0.12
Process Skills    47      0.05        0.16        -0.07       0.04        -0.03
Analytic Skills   33      0.00        0.34*       -0.07       0.01        0.16
Related Skills    48      0.28*       0.09        0.00        -0.08       -0.28*
Other Areas       20      -0.11       0.03        -0.07       -0.05       -0.02
Composite         306     -0.02       0.04        -0.17*      -0.02       0.01

* Value is significant at the a priori alpha level (0.05).



In studies where grade level is specified, how do students 
exposed to new science curricula compare to students in 
traditional science courses? 

Data are presented in Table 20 which address the question of
curriculum effectiveness by grade level. Conventional grade level
groupings (i.e., elementary (K-6), junior high (7-9), high school
(10-12), and post-secondary) are used for two reasons: (1) specific
grade level data are not available in the majority of studies reported,
and (2) the limited number of studies in many of the specific grades
prohibits meaningful quantitative synthesis.




TABLE 20

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY GRADE
LEVEL ACROSS ALL NEW SCIENCE CURRICULA

CLUSTER             N           MEAN Δ      MINIMUM Δ   MAXIMUM Δ   S.D.        t-value

Elementary (K-6)
  Achievement       27          0.37        -0.57       2.47        0.74        2.64*
  Perceptions       29          0.28        -0.82       0.83        0.46        3.28*
  Process Skills    16          0.56        -0.02       2.50        0.59        3.84*
  Analytic Skills   1           0.06        0.06        0.06        0.00        --
  Related Skills    37          0.17        -0.41       1.04        0.27        3.84*
  Other Areas       14          0.32        -0.55       1.50        0.55        2.22*
  Composite         124         0.31        -0.82       2.50        0.52        6.55*

Junior High (7-9)
  Achievement       13          0.23        -0.33       0.86        0.34        2.46*
  Perceptions       11          0.59        0.17        1.12        0.31        6.14*
  Process Skills    18          0.23        -0.62       0.73        0.39        2.49*
  Analytic Skills   14          [illegible] [illegible] [illegible] 0.23        [illegible]
  Related Skills    9           0.68        -0.10       4.48        1.46        1.41
  Other Areas       7           0.33        -0.70       0.74        0.48        1.84
  Composite         72          0.31        -0.70       4.48        0.62        4.24*

High School (10-12)
  Achievement       83          0.37        -1.04       4.18        0.80        4.29*
  Perceptions       9           0.44        -0.81       1.75        0.70        1.90
  Process Skills    19          0.43        -0.33       2.45        0.68        2.77*
  Analytic Skills   19          0.42        -0.18       1.44        0.50        3.66*
  Related Skills    2           -0.23       -0.50       0.04        0.38        -0.85
  Composite         132         0.38        -1.04       4.18        0.73        6.06*

Post-Secondary
  Achievement       5           0.47        0.13        0.82        0.27        3.91*
  Perceptions       1           0.15        0.15        0.15        0.00        --
  Process Skills    2           0.09        0.08        0.10        0.01        9.00
  Analytic Skills   1           0.23        0.23        0.23        0.00        --
  Composite         9           0.32        0.08        0.82        0.25        3.70*

* Value is significant at the a priori alpha level (0.05).
[illegible] = entry cannot be read in the source copy; -- = no entry.



A rough estimate of new curriculum effectiveness by grade level 
is available in the "composite" line entries of Table 20. The data 
for all the different criterion variables are treated as one composite 
variable, i.e., student performance. The data show that students 
participating in new science curricula performed better than their 
traditional course counterparts by 0.31 to 0.38 standard deviations 
across all performance measures. Thus, the average student in new 
science curricula (by grade level) exceeded the performance of 62-65%
of the students in traditional courses. 

The detailed data in Table 20 show the effect sizes by criterion
clusters. Data were available to calculate approximately 125 effect
sizes for both elementary grade level and high school studies but
only 72 for the junior high school level. Similarly, data were
available for 11 post-secondary effect size calculations. These
post-secondary effect size calculations were for study situations in
which no modifications were made to the curricula being used and the
students had not had a previous course in that science discipline.
Thus, reasonable comparisons can be made.

Among the more interesting results in Table 20 are the significant
differences for process skills at the elementary level, perceptions
at the junior high school level, and achievement at the post-secondary
level. The process skill area was targeted as a critical area among
elementary school science curriculum developers. Based on the Table
20 data, that goal is being achieved. At the junior high level
student attitudes were considered a prime target (i.e., getting
students to like science). Here again, the new programs show their
effectiveness.

What curriculum designers had not expected was the success of 
some programs in the post-secondary arena. The data show that when 
new science materials are used with junior college and beginning 
college students, achievement is enhanced. 

When grouped by gender, how do students exposed to new
science curricula compare to students in traditional
science courses?

A recurring question in science education deals with the sex-bias
of certain science materials and even entire subject areas. There is
an intuitive notion among lay people and educators that males
gravitate toward selected science areas on a random basis but that
women tend to be attracted to non-quantitative areas of science, if
they are attracted toward science at all. Following the intuitive
logic one step further, the reason cited for female aversion to science
is their poor performance in science related areas. The question of
sex-bias in the new curriculum materials is not one that this
meta-analysis can answer with the type of data available. However,
there are some data that deal with the question of student gender and
performance in the new science curricula compared to the traditional
courses. The data are presented in Table 21 and shown graphically in
Figure 2.

The data in Table 21 are grouped according to the make-up of the
student populations sampled in the research studies coded. If the
percentage of females was reported as less than 25% of the total
sample studied, the study was classified as predominantly male. If
the percentage of females was reported as greater than 75%, the study
was classified as predominantly female. Samples in which females made
up between 25% and 75% were classified as a mixed group. For the
composite data of Table 21 the performance criteria were collapsed across
all performance factors to provide a gross indicator of science
curriculum-student gender interaction.
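
The grouping rule just described can be restated as a short function. The Python sketch below is illustrative only: the function name and the returned labels are assumptions made for this example, and the treatment of samples at exactly 25% or 75% (counted here as mixed) follows the wording of the rule rather than any documented coding manual.

```python
def classify_sample(percent_female: float) -> str:
    """Classify a study sample by its reported percentage of females.

    Thresholds follow the rule stated in the text: below 25% female is
    predominantly male, above 75% female is predominantly female, and
    anything in between (boundaries included) is treated as mixed.
    """
    if not 0 <= percent_female <= 100:
        raise ValueError("percent_female must be between 0 and 100")
    if percent_female < 25:
        return "predominantly male"
    if percent_female > 75:
        return "predominantly female"
    return "mixed"


if __name__ == "__main__":
    for pct in (10, 25, 50, 75, 90):
        print(pct, classify_sample(pct))
```

Because the three categories are mutually exclusive, each effect size lands in exactly one gender grouping.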

The breakdown of the data in Table 21 by sample type is intuitively
interesting. Only 19 effect sizes were calculated for samples
with more than 75% females while 123 were calculated for predominantly
male samples. The composite performance results, however, show
clearly that predominantly male and predominantly female samples
performed equally well, about a quarter standard deviation better than






TABLE 21

EFFECT SIZE DATA FOR CRITERION VARIABLES BY STUDENT
GENDER ACROSS ALL NEW SCIENCE CURRICULA

CLUSTER               N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Male Sample
Achievement           58    0.25         -1.04        2.70         [illegible]  [illegible]
Perceptions           12    -0.02        -0.82        0.76         [illegible]  [illegible]
Process Skills        18    0.16         -0.44        1.19         [illegible]  [illegible]
Analytic Skills       21    0.30         -0.18        1.41         [illegible]  [illegible]
Related Skills        8     0.01         -0.22        0.28         [illegible]  [illegible]
Other Areas           6     0.47         -0.09        [illegible]  [illegible]  [illegible]
Composite             123   [illegible]  -1.04        2.70         0.45         5.31*

Mixed Sample
Achievement           68    0.45         -0.57        4.18         [illegible]  [illegible]
Perceptions           34    0.51         -0.32        1.75         [illegible]  [illegible]
Process Skills        33    0.52         -0.62        2.50         [illegible]  [illegible]
Analytic Skills       9     0.31         -0.27        1.44         [illegible]  [illegible]
Related Skills        40    0.30         -0.50        4.48         [illegible]  2.55*
Other Areas           15    0.27         -0.70        1.50         0.56         1.88
Composite             199   0.43         -0.70        4.48         0.71         8.44*

Female Sample
Achievement           4     0.55         -0.52        1.65         0.88         1.25
Perceptions           5     0.32         -0.40        0.82         0.45         1.58
Process Skills        5     0.29         -0.05        0.52         0.23         2.80*
Analytic Skills       5     -0.10        -0.36        0.21         0.24         -0.94
Composite             19    0.25         -0.52        1.65         0.50         2.15*

*Value is significant at the a priori alpha level (0.05).








their traditional course comparison groups. What is not easily 
explained is the substantially greater mean effect size for the mixed 
groups. Perhaps there is a social dimension to learning and liking 
science that must be accounted for in the classroom. 
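
To give a feel for what "about a quarter standard deviation better" means, an effect size can be converted to a percentile of the comparison group. The Python sketch below is a minimal illustration, assuming both groups' scores are roughly normally distributed with similar spread; that assumption and the function name are introduced here for illustration and are not stated in this section of the report.

```python
import math


def percentile_in_comparison_group(effect_size: float) -> float:
    """Percentile of the comparison group reached by the average treated student.

    Assumes normally distributed scores, so the percentile is the standard
    normal cumulative distribution evaluated at the effect size.
    """
    return 50.0 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))


if __name__ == "__main__":
    # An effect size of 0.25 places the average student in the new-curriculum
    # group at roughly the 60th percentile of the traditional-course group.
    print(round(percentile_in_comparison_group(0.25), 1))
```

Under the same assumption, an effect size of zero corresponds to the 50th percentile, that is, no difference between groups.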

The breakdown of the data in Table 21 reveals two interesting
features. The first deals with the consistently higher effect size
pattern of the mixed sample studies. On almost every criterion measure
the mixed group samples produced more positive differences. The
second feature deals with the analytic skills data for the female
samples. The -0.10 effect size represents the largest negative result
encountered in the new science curricula-traditional course
comparisons.

An ANOVA was conducted on the effect size data grouped by
individual performance criterion cluster and by composite performance by
gender. The difference in overall student performance between the
predominantly male samples and the mixed samples was statistically
significant on the basis of a Duncan's Multiple Range Test at the
alpha 0.05 level. However, no significant differences between the
groupings were found for the individual criterion clusters. A summary
of the ANOVA data appears in Table 22.
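
The F-values in an ANOVA summary of this kind follow from the sums of squares and degrees of freedom: each mean square is SS divided by df, and F is the effect mean square divided by the error mean square. The Python sketch below applies that standard arithmetic to the Gender and Error rows as they read in Table 22 (SS = 3.19 on 2 df; SS = 126.91 on 325 df). The function name is hypothetical, and the check is offered only as a plausibility test of the printed figures, not as a description of the software the authors used.

```python
def f_ratio(ss_effect: float, df_effect: int, ss_error: float, df_error: int) -> float:
    """Return the F ratio: (SS_effect / df_effect) / (SS_error / df_error)."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    ms_effect = ss_effect / df_effect   # mean square for the effect
    ms_error = ss_error / df_error      # mean square for error
    return ms_effect / ms_error


if __name__ == "__main__":
    # Gender row of Table 22 against the Error row.
    print(round(f_ratio(3.19, 2, 126.91, 325), 2))
```

The result is about 4.08, which agrees with the reported F-value of 4.09 to within rounding of the printed sums of squares.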

In studies where school type is specified, how do
students exposed to new science curricula compare to
students in traditional science courses?

Effect size data for student performance in new science programs 
versus traditional courses grouped by school type are presented in 






TABLE 22

ANOVA SUMMARY FOR EFFECT SIZE DATA GROUPED BY
CRITERION CLUSTER AND SAMPLE GENDER

SOURCE              df     SS            MS        F-value

Model               15     7.90          0.53      1.35*
Cluster             5      [illegible]   0.22      0.59
Gender              2      3.19          1.59      4.09*
Cluster*Gender      8      3.09          1.54      0.99
Error               325    126.91        0.39
Corrected Total     340

*Value is significant at the a priori alpha level (0.05).



Table 23. Since information regarding the school type from which
samples were drawn was not available in all studies reviewed, the
number of effect sizes included in the analysis is substantially
reduced from previous analyses. Assuming the uncodable studies would
disperse themselves equally among the three school type categories (a
conservative estimate), it is clear that the bulk of research
conducted on questions of curriculum effectiveness is done in suburban
schools.

The composite data in Table 23 indicate that the new science
curricula apparently impacted suburban and urban schools more
positively than the rural schools. The breakdown of effect size data by
criterion clusters magnifies this composite data disparity.






TABLE 23

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY SCHOOL
TYPE ACROSS ALL NEW SCIENCE CURRICULA

CLUSTER               N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Rural
Achievement           9     0.34         0.04         0.71         0.25         3.94*
Perceptions           9     -0.07        -0.82        0.82         0.58         -0.40
Process Skills        6     0.45         0.15         [illegible]  [illegible]  [illegible]
Other Areas           1     0.06         0.06         0.06         0.00         --
Composite             25    0.20         -0.82        0.82         0.44         [illegible]

Suburban
Achievement           72    0.41         -1.04        4.18         0.85         4.11*
Perceptions           19    0.46         0.00         1.75         0.40         5.04*
Process Skills        13    0.50         -0.62        2.45         0.85         2.11*
Analytic Skills       17    0.27         -0.27        1.44         0.45         2.45*
Related Skills        34    0.30         -0.41        4.48         0.79         2.22*
Other Areas           13    0.37         -0.55        1.50         0.56         2.37*
Composite             168   0.38         -1.04        4.48         0.74         6.72*

Urban
Achievement           4     0.81         0.20         1.08         0.41         3.95*
Perceptions           2     0.64         0.60         0.69         0.05         14.33*
Process Skills        12    0.24         -0.44        1.19         0.44         1.92
Analytic Skills       11    0.17         -0.36        1.41         0.47         1.19
Related Skills        2     0.41         0.08         0.75         0.47         1.24
Other Areas           1     0.86         0.86         0.86         0.00         --
Composite             32    0.34         -0.44        1.41         0.47         4.13*

*Value is significant at the a priori alpha level (0.05).






Specifically, on achievement measures the rural school mean is the
lowest of the three groups though not substantially different from
the suburban mean. However, on measures of student perceptions, the
rural school data are more than a half standard deviation lower than
both the suburban and urban groups. An ANOVA was performed on the
effect size data to test the significance of the differences for each
of the performance criterion groupings across the three school types.
In essence a one-way ANOVA of effect size data is a test of the
interaction of new curricula and school type on student performance.
The differences were not statistically significant at the α < 0.05
level.

When grouped by socio-economic status, how do students
exposed to new science curricula compare to students in
traditional science courses?

The data in Table 24 indicate that very few studies of new science
curricula have been conducted in which student socio-economic status
was isolated as a study variable. The extremely small number of
studies completed on low socio-economic student samples (N=4)
diminishes the power of quantitative synthesis techniques considerably.
The distribution of the 19 effect sizes calculated for the high SES
samples also limits meaningful discussion regarding the achievement
studies analysis at the criterion cluster level. Nonetheless, the high
SES achievement data are interesting. The achievement (Δ = 1.00) and
the composite data (Δ = 0.99) effect sizes for the high SES students






TABLE 24

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY STUDENT
SOCIO-ECONOMIC STATUS ACROSS ALL NEW SCIENCE CURRICULA

CLUSTER               N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Low SES
Achievement           1     [illegible]  --           --           0.00         --
Process Skills        1     [illegible]  --           --           0.00         --
Related Skills        2     [illegible]  0.08         [illegible]  [illegible]  1.24
Composite             4     0.63         0.08         1.08         0.41         3.06*

Mid-SES
Achievement           105   0.27         -0.57        [illegible]  [illegible]  [illegible]
Perceptions           49    [illegible]  -0.82        [illegible]  [illegible]  [illegible]
Process Skills        49    [illegible]  -0.62        [illegible]  [illegible]  5.00*
Analytic Skills       33    [illegible]  -0.36        1.44         0.42         3.10*
Related Skills        46    0.24         -0.50        4.48         0.70         2.38*
Other Areas           20    0.31         -0.70        1.50         0.52         2.70*
Composite             302   0.28         -0.82        4.48         0.51         9.71*

High SES
Achievement           11    1.00         -0.26        4.18         1.59         2.10*
Perceptions           2     1.40         1.05         1.75         0.49         [illegible]
Process Skills        4     1.00         -0.33        2.45         1.31         1.52
Analytic Skills       2     0.50         -0.08        1.08         0.82         [illegible]
Composite             19    0.99         -0.33        4.18         1.34         3.21*

*Value is significant at the a priori alpha level (0.05).








are among the highest mean values for any analysis by group in this 
study. An ANOVA performed on the mean effect sizes for the composite 
performance data shows that the high SES composite effect size mean 
is significantly different from the mid-SES group, on the basis of a 
Duncan Multiple Range Test, at the alpha 0.05 level. The ANOVA 
summary is presented in Table 25. 
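
The t-values reported alongside each mean effect size in Table 24 appear consistent with a one-sample t test of the mean against zero, t = mean / (S.D. / sqrt(N)). That reading is an inference from the printed numbers, not a procedure stated in this section, and the function name below is hypothetical. The Python sketch recomputes the composite t-values for the low and high SES groups from the tabled N, mean, and S.D.

```python
import math


def one_sample_t(mean: float, sd: float, n: int) -> float:
    """t statistic for testing whether a mean effect size differs from zero."""
    if n < 2:
        raise ValueError("at least two effect sizes are needed")
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    standard_error = sd / math.sqrt(n)
    return mean / standard_error


if __name__ == "__main__":
    # Composite rows of Table 24: (N, mean, S.D.)
    print(round(one_sample_t(0.63, 0.41, 4), 2))   # low SES, reported as 3.06
    print(round(one_sample_t(0.99, 1.34, 19), 2))  # high SES, reported as 3.21
```

The recomputed values (about 3.07 and 3.22) differ from the printed ones only in the second decimal place, as would be expected if the table entries were rounded before printing. The sketch also shows why a group with N = 4 carries little weight: its standard error is large relative to groups with hundreds of effect sizes.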






TABLE 25

ANOVA SUMMARY FOR EFFECT SIZE DATA GROUPED BY
CRITERION CLUSTER AND SOCIO-ECONOMIC STATUS

SOURCE              df     SS            MS        F-value

Model               12     10.72         0.89      2.51*
Cluster             5      0.57          0.11      0.32
SES                 2      5.31          2.66      7.48*
Cluster*SES         5      0.89          0.18      0.50
Error               312
Corrected Total     324

*Value is significant at the a priori alpha level (0.05).

When teachers receive inservice training with a particular
new science curriculum, how do students exposed to the
science program compare to students in traditional courses?

The inservice data in Table 26 are extremely interesting in light
of the debates and discussions during the past 25 years regarding the
cost-effectiveness of inservice teacher education. Unfortunately,
the data for this analysis are incomplete, as detailed information
regarding inservicing was available only in about 30% of the studies
coded. These studies yielded 126 effect sizes. A summary of effect
size data by criterion cluster for studies not reporting the inservice
backgrounds of participating teachers is also reported in Table 26 to
facilitate a more meaningful discussion of the known data.

An examination of the composite and criterion cluster effect
size data reveals a striking difference in overall student performance.
On every measure where data are available for the inservice versus
no inservice summaries the effect sizes are higher for the no inservice
studies. Even if one were to assume that the effects of inservice
education would only result in improved achievement scores for
students taking courses from such teachers, the no inservice effect
size for achievement (Δ = 0.46) is considerably higher than the
achievement effect size (Δ = 0.22) for students taking courses from
teachers who did receive such education.

When teachers receive special instruction in the use of
materials for a particular science curriculum or method
prior to receiving teacher certification (preservice
instruction), how do students exposed to those science
programs compare to students in traditional courses?

When considering the inservice data of Table 26 it is difficult
to know exactly how many, or what percentage, of the teachers may
have received inservice instruction in studies where that information
was not reported. Based upon the years of experience data coded,
however, approximately 65% of the teachers participating in the






TABLE 26

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY INSERVICE
EXPERIENCE ACROSS ALL NEW SCIENCE CURRICULA

CLUSTER               N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Inservice
Achievement           40    0.22         -0.52        1.65         0.41         3.50*
Perceptions           19    0.16         -0.82        0.82         0.48         1.47
Process Skills        27    0.32         -0.62        2.50         0.55         3.02*
Analytic Skills       15    0.07         -0.36        0.44         0.22         1.36
Related Skills        5     0.12         -0.22        0.57         [illegible]  [illegible]
Other Areas           6     0.57         -0.09        1.50         0.57         2.47*
Composite             112   0.23         -0.82        2.50         0.45         5.44*

No Inservice
Achievement           9     0.46         -0.13        0.92         0.39         3.53*
Perceptions           2     0.64         0.60         0.69         0.06         14.33*
Process Skills        1     0.32         --           --           0.00         --
Analytic Skills       2     0.62         0.49         0.75         0.18         4.77
Composite             14    0.50         -0.13        0.92         0.32         5.72*

Data Not Reported
Achievement           81    0.43         -1.04        4.18         0.87         4.46*
Perceptions           30    0.48         -0.81        1.75         0.47         5.64*
Process Skills        28    0.45         -0.33        2.45         0.58         4.13*
Analytic Skills       18    0.34         -0.27        1.44         0.54         2.68*
Related Skills        43    0.26         -0.50        4.48         0.72         2.43*
Other Areas           15    0.23         -0.70        0.86         0.47         1.88
Composite             215   0.38         -1.04        4.48         0.71         8.01*

*Value is significant at the a priori alpha level (0.05).






studies "not reporting" preservice background graduated prior to 1960.
This rules out the possibility of any preservice training for such
teachers. With this in mind, roughly 15% of the studies reported
teachers receiving some form of preservice instruction.

The data in Table 27, like the inservice data, form a pattern 
showing a less positive impact on student performance when teachers 
involved in the studies received preservice instruction. Possible 
reasons for the preservice and inservice performance data are 
discussed in the summary statements of this report. 

Study Characteristics 

When the level of internal validity is accounted for in
studies meta-analyzed, how do students exposed to new
science curricula compare to students in traditional
science courses?

Critics of meta-analysis express concern regarding the problem
of combining results from both "good" and "poor" studies. Certainly
this is a valid criticism, if the collective results of studies rated
"good" are significantly different from those rated "poor." But, as
Glass (1980) points out, "... if 'good' and 'poor' studies do not
differ greatly in their findings, a large data base (all studies
regardless of quality) is much to be preferred over a small data base
(only the 'good' studies). The larger data base can be more readily
subdivided to answer specific sub-questions that are inevitably
provoked by the answers to the general questions ..." (p. 286).




TABLE 27

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY PRESERVICE
EXPERIENCE ACROSS ALL NEW SCIENCE CURRICULA

CLUSTER               N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Preservice
Achievement           18    0.20         -0.49        1.09         0.36         2.35*
Perceptions           4     0.39         0.21         0.66         0.20         3.79*
Process Skills        9     0.22         -0.62        0.73         0.41         1.61
Composite             31    0.23         -0.62        1.09         0.36         3.60*

No Preservice
Achievement           9     [illegible]  -0.13        0.92         [illegible]  [illegible]
Perceptions           12    [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
Process Skills        2     [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
Analytic Skills       5     0.27         0.01         0.75         0.32         1.90
Related Skills        1     0.57         --           --           0.00         --
Other Areas           1     1.50         --           --           0.00         --
Composite             30    0.33         -0.82        2.50         0.65         2.82*

Data Not Reported
Achievement           103   0.40         -1.04        4.18         0.80         5.15*
Perceptions           35    0.46         -0.81        1.75         0.44         6.14*
Process Skills        45    0.37         -0.44        2.45         0.50         4.98*
Analytic Skills       30    0.24         -0.36        1.44         0.46         2.85*
Related Skills        47    0.24         -0.50        4.48         0.69         2.43*
Other Areas           20    0.27         -0.70        0.86         0.45         2.68*
Composite             280   0.35         -1.04        4.48         0.65         9.13*

*Value is significant at the a priori alpha level (0.05).

Studies included in this meta-analysis were rated on several
design features. One of these features was the overall internal
validity of the study, which the coders rated as: (1) Low (intact and
highly dissimilar groups), (2) Medium (random samples or matched
samples with some threats to internal validity), or (3) High (random
samples with low mortality). Summary effect size data for the three
rated levels of internal validity are presented in Table 28.

An examination of the composite effect size means in Table 28
reveals a range of 0.05 standard deviations between studies at the
extremes of judged internal validity. A one-way ANOVA of the effect
size data by rated internal validity revealed no significant
differences in the composite effect size means of the three validity
rankings. A further analysis of effect size means for each criterion
cluster by judged internal validity also revealed no significant
differences. It is safe to assume, therefore, that any conclusions
based on sub-groupings of study results reported herein are not
weakened by level of internal validity of the original research studies
analyzed.

When the type of criterion measure used is considered in
the meta-analysis, how do students exposed to new science
curricula compare to students in traditional courses?

A particular threat to a study's internal validity concerns the
instrumentation used in measuring the dependent variables, in this
case, student performance. Unvalidated, experimenter-made tests pose
a threat to validity because of the high risk of experimenter bias in






TABLE 28

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY RATED
LEVEL OF INTERNAL VALIDITY

CLUSTER               N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Low Internal Validity
Achievement           52    0.33         -0.52        2.70         0.67         3.59*
Perceptions           17    0.51         -0.32        1.12         0.35         6.01*
Process Skills        10    0.58         -0.10        2.50         0.71         2.59*
Analytic Skills       7     [illegible]  [illegible]  [illegible]  0.42         1.76
Related Skills        16    0.21         -0.22        0.75         0.25         3.39*
Other Areas           8     0.16         -0.55        0.81         0.53         0.07
Composite             110   0.35         -0.55        2.70         0.56         [illegible]

Medium Internal Validity
Achievement           66    0.40         -1.04        4.18         0.84         3.91*
Perceptions           30    0.27         -0.82        1.75         0.56         2.66*
Process Skills        [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  3.83*
Analytic Skills       27    [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
Related Skills        32    0.27         -0.50        4.48         0.83         1.86
Other Areas           10    0.37         -0.70        0.86         0.42         2.79*
Composite             205   0.33         -1.04        4.48         0.68         6.87*

High Internal Validity
Achievement           12    0.35         -0.02        0.86         0.28         4.29*
Perceptions           [illegible]  0.47  0.37         0.55         0.07         12.37*
Process Skills        5     0.35         0.26         0.63         0.15         5.03*
Analytic Skills       1     -0.07        --           --           0.00         --
Other Areas           3     0.61         0.06         1.50         0.77         1.38
Composite             25    0.38         -0.07        1.50         0.33         5.86*

*Value is significant at the a priori alpha level (0.05).




the construction or selection of test items. While there is no test
which is totally unbiased, a conservative approach to the resolution
of the test bias question in a research synthesis study is to segregate
those studies using standardized tests for closer scrutiny. The
results of the meta-analysis on student performance data by
standardized test versus other forms are presented in Table 29.

The data in Table 29 indicate that student performance results 
do not appear to be influenced by the type of test used. No signifi- 
cant differences were revealed in the comparison of composite per- 
formance data nor on criterion cluster data between standardized and 
other test forms. 

When length of treatment is isolated as a factor, how do
students exposed to new science curricula compare to
students in traditional science courses?

Campbell and Stanley (1966) define internal validity as "the
basic minimum without which any experiment is uninterpretable" (p. 5).
One of the major threats to a study's internal validity deals with
treatment fidelity; in other words, are the treatment conditions which
characterize the comparison groups discernible and reasonable? Few
studies of new science curricula reported information from which
treatment fidelity could be judged. However, considering the nature
of the treatment condition of interest in this meta-analysis (i.e.,
student exposure to new science curricula versus traditional programs),
the length of exposure to the programs constitutes a reasonable
approximation to the question of treatment fidelity.






TABLE 29

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY TYPE
OF TEST USED

CRITERION             N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Standardized
Achievement           73    0.35         -1.04        4.18         0.76         3.94*
Perceptions           34    0.34         -0.82        1.75         0.49         3.97*
Process Skills        [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
Analytic Skills       35    0.24         -0.36        1.44         0.44         3.29*
Related Skills        31    0.23         -0.41        4.48         0.81         1.58
Other Areas           7     0.59         -0.09        1.50         0.50         3.12*
Composite             218   0.32         -1.04        4.48         0.64         7.35*

Other Forms
Achievement           57    0.39         -0.52        2.70         0.69         4.26*
Perceptions           17    0.43         -0.81        1.12         0.48         3.71*
Process Skills        18    0.50         -0.17        2.50         0.62         3.44*
Related Skills        17    0.29         -0.50        1.04         0.38         3.17*
Other Areas           14    0.19         -0.70        0.86         0.48         1.52
Composite             123   0.37         -0.81        2.70         0.59         7.01*

*Value is significant at the a priori alpha level (0.05).



Table 30 contains summary effect size data for studies grouped by
the length of study. Four levels of treatment duration were chosen
for analysis: (1) less than 10 weeks, (2) between 10 and 20 weeks,
(3) between 21 and 36 weeks, and (4) longer than 36 weeks.
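
The four duration levels can be expressed as a simple binning rule. The Python sketch below is illustrative only: the function name and labels are assumptions, and because the stated levels leave durations between 20 and 21 weeks unassigned, the sketch assumes (without support from the source) that such fractional values belong to the third level.

```python
def treatment_length_level(weeks: float) -> str:
    """Assign a treatment duration in weeks to one of the four analysis levels."""
    if weeks < 0:
        raise ValueError("weeks must be non-negative")
    if weeks < 10:
        return "less than 10 weeks"
    if weeks <= 20:
        return "between 10 and 20 weeks"
    if weeks <= 36:
        # Assumption: fractional durations between 20 and 21 weeks fall here.
        return "between 21 and 36 weeks"
    return "longer than 36 weeks"


if __name__ == "__main__":
    for w in (6, 10, 20, 21, 36, 40):
        print(w, "->", treatment_length_level(w))
```

Since a standard school year is roughly 36 weeks, the last level corresponds to treatments that extend into a second school year.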






TABLE 30

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY
LENGTH OF TREATMENT

CRITERION             N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Less than 10 Weeks
Achievement           6     0.55         -0.49        2.47         1.01         1.34
Perceptions           2     0.43         0.35         0.51         0.11         5.38
Analytic Skills       1     0.01         --           --           0.00         --
Related Skills        1     0.34         --           --           0.00         --
Other Areas           6     0.05         [illegible]  [illegible]  [illegible]  0.27
Composite             [illegible]  [illegible]  [illegible]  2.47  0.70         1.73

Between 10 and 20 Weeks
Achievement           14    0.22         -0.03        0.92         0.29         2.86*
Perceptions           19    0.21         -0.82        1.12         0.58         1.59
Process Skills        11    0.37         0.08         0.70         0.17         6.98*
Analytic Skills       3     0.43         0.06         0.75         0.34         2.15
Related Skills        12    0.49         -0.10        4.48         1.27         1.34
Other Areas           2     0.89         0.29         1.50         0.85         1.48
Composite             61    0.33         -0.82        4.48         0.68         3.82*

Between 21 and 36 Weeks
Achievement           94    0.39         -1.04        4.18         0.79         4.80*
Perceptions           23    0.46         -0.81        1.75         0.47         4.75*
Process Skills        43    0.40         -0.62        2.50         0.63         4.12*
Analytic Skills       30    0.23         -0.36        1.44         0.46         2.76*
Related Skills        21    0.24         -0.50        1.04         0.33         3.32*
Other Areas           6     0.19         -0.70        0.49         0.45         1.07
Composite             217   0.36         -1.04        4.18         0.65         8.16*

Longer than 36 Weeks
Achievement           14    0.32         -0.57        1.65         0.57         2.11*
Perceptions           6     0.50         0.21         0.76         0.19         6.29*
Process Skills        1     0.32         --           --           0.00         --
Related Skills        12    0.00         -0.41        0.33         0.25         0.02
Other Areas           5     0.40         -0.09        0.74         0.35         2.55
Composite             38    0.26         -0.57        1.65         0.43         3.69*

*Value is significant at the a priori alpha level (0.05).

The data in Table 30 show that length of treatment appears to
have no effect upon the composite performance data. The values range
from Δ = 0.26 for studies longer than 36 weeks to Δ = 0.36 for studies
between 21 and 36 weeks long. An ANOVA of both the composite
performance data and performance cluster data revealed no significant
differences in mean effect size values. It is interesting to note,
however, that composite performance effect size data show a pattern
of positive increases for studies spanning periods less than 10 weeks
up to 36 weeks and a decline for studies conducted across two school
years (greater than 36 weeks). This regression effect of composite
student performance after 36 weeks' exposure, contrasted to the
stabilization, even strengthening, of the perceptions data for
treatments beyond 36 weeks (Δ = 0.50), suggests that new science curricula
may have been most effective in changing student attitudes.






When studies are grouped by the form of publication, how
do students exposed to new science curricula compare to
students in traditional science courses?

Whether a research report is published, or where it is published,
does not constitute a threat to the validity of a study nor should it
be considered a source of bias. But there is a prevailing notion
among some researchers and practitioners that only significant results
are publishable. In coding the results and characteristics of studies
included in this meta-analysis of new science curricula effects, the
primary source of the study was coded. In the event a study was
published in two forms (e.g., a dissertation and a journal article),
the original source of the data was recorded (in this example, the
dissertation). Summary effect size data for student performance
variables grouped by source, or form of publication, are presented
in Table 31.

The data in Table 31 reveal no major differences in composite
performance effect size when grouped according to publication form.
In fact, the pattern of more pronounced effect size values for the
unpublished materials contradicts the "only significant results are
published" argument. It is interesting to note, though, that the
lion's share of research on this topic has been completed by graduate
student researchers completing doctoral dissertations.






TABLE 31

EFFECT SIZE DATA FOR CRITERION CLUSTERS BY
FORM OF PUBLICATION

CRITERION             N     MEAN Δ       MINIMUM Δ    MAXIMUM Δ    S.D.         t-value

Journal Articles
Achievement           26    0.36         -1.04        2.70         0.81         2.27*
Perceptions           10    0.44         0.05         0.85         0.25         5.49*
Process Skills        10    0.39         0.08         0.70         0.23         5.23*
Analytic Skills       2     0.47         -0.27        1.22         1.05         0.64
Related Skills        11    -0.08        -0.41        0.28         0.19         -1.45
Other Areas           4     0.41         -0.09        0.86         0.44         1.86
Composite             63    0.30         -1.04        2.70         0.59         4.11*

Dissertations/Theses
Achievement           97    0.39         -0.52        4.18         0.73         5.24*
Perceptions           34    0.36         -0.82        1.75         0.53         3.95*
Process Skills        40    0.42         -0.62        2.50         0.64         4.15*
Analytic Skills       32    0.23         -0.36        1.44         0.42         3.18*
Related Skills        25    0.20         -0.50        0.75         0.23         4.40*
Other Areas           15    0.28         -0.70        1.50         0.58         1.94
Composite             243   0.34         -0.82        4.18         0.61         8.84*

Unpublished Material
Achievement           7     0.14         -0.46        0.54         0.43         0.86
Perceptions           7     0.29         -0.81        0.76         0.54         1.43
Process Skills        6     0.14         -0.17        0.30         0.20         1.79
Analytic Skills       1     0.15         --           --           0.00         --
Related Skills        12    0.66         -0.01        4.48         1.25         1.84
Other Areas           2     0.53         0.25         0.81         0.39         1.89
Composite             35    0.37         -0.81        4.48         0.81         2.74*

*Value is significant at the a priori alpha level (0.05).






SUMMARY 

Literally dozens of interesting questions come to mind when the
issue of curriculum effectiveness is raised. Numerous factors enter
into the interpretation of data regarding even the most straightforward
question. In the case of the new science curricula developed in the
post-Sputnik years of the sixties and seventies, numerous studies were
completed in which student performance in new science programs was
compared to student performance in traditional courses. The results
of any one study regarding the impact of a particular program are a
matter of record. The collective results of the multiple studies
conducted on several of the curricula are not so numerous. Moreover,
these reports tend to be qualitative summaries which lack credibility
and engender little or no confidence in the field.

The criticisms of qualitative research integration techniques are
well-known. Jackson (1978) summarizes these criticisms as follows:

(1) Reviewers often ignore previous reviews on the same or
similar topic.

(2) Reviewers often run the risk of sampling errors by
selecting non-representative subsets of existing
literature.

(3) Reviewers often use inappropriate representations of
study results (e.g., whether or not results were
statistically significant).

(4) Reviewers often fail to recognize and account for
study characteristics which might affect results
(e.g., study sample, treatment fidelity, testing
procedures).

(5) Reviewers often report so little about their review
procedures it is difficult to judge the validity of
the conclusions.

In an effort to tease out of the literature a concentrated mass
of summative data regarding the comparison of student performance in
new and traditional curricula and to avoid the pitfalls of research
integration, a quantitative analysis of experimental and
quasi-experimental results from the retrievable literature was performed.
The quantitative integration technique used is referred to as
meta-analysis (Glass, 1976). Conclusions in this report are based on data
from 105 studies deemed suitable for quantitative integration from
the pool of 302 studies identified. The 105 studies yielded
information on 27 different new science curricula and 18 different student
performance measures. Approximately 70 study characteristics were
coded in reviewing each research report. Using the distribution of
science curricula and student performance criteria and the collection
of study characteristics, 15 major sub-questions were analyzed. A
summary of results is presented below:

I. The average student exposed to new science curricula
exceeded the performance of 65% of the students in
traditional science courses on the aggregate criterion
variable.



II. The effects of new science curricula on student
performance were most impressive for the following
performance criteria: creativity, laboratory techniques,
attitude toward specific subject and science,
and general achievement. The only negative effect
size calculated for students exposed to new
science curricula was for student self-concept
(Δ = -0.08).

III. Student overall performance scores were found to be
significantly more positive for mixed student samples
than with either female or male groups among the new
curricula studied.

IV. Student overall performance scores were found to be
significantly more positive for both high and low
socio-economic student samples than for the mid-range
socio-economic groups.

V. Student overall performance scores were found to be
significantly more positive for student samples
attending either urban or suburban schools than for
rural school students among the new curricula studied.

VI. New science curricula appear to have been most 
effective in enhancing student process skill 
development at the elementary school level. 



VII. New science curricula appear to have been most
effective in changing student attitudes at the junior high
school level.

VIII. New science curricula appear to have been most
effective in enhancing student analytic skills at the high
school level.

IX. New science curricula appear to have been most
effective in enhancing student achievement at the
post-secondary, secondary and elementary grade levels.

X. New science curricula emphasizing inquiry, process,
laboratory and individualization were observed to
adversely affect student perceptions about their
experiences in the programs.

XI. New science curricula emphasizing process skill
development were observed to adversely affect
student analytic thinking skills.

XII. New science curricula emphasizing laboratory activity
were observed to adversely affect overall student
performance.

XIII. Of the major science curricula studied, BSCS-Blue,
BSCS-Yellow, and PSSC exhibited the best overall
student performance record.

XIV. Of the new curricula studied, PSSC and BSCS-Blue were
found to be most effective in enhancing student
achievement.



XV. Of the new curricula studied, BSCS-Blue, BSCS-Green,
HSP, SCIL, and IS were found to be most effective in
enhancing student perceptions.

XVI. Of the new curricula studied, BSCS-Blue, SCIS, ESS, and
S-APA were found to be most effective in enhancing
student process skills.

XVII. Of the new curricula studied, BSCS-Blue, PSSC and CHEM
Study were found to be most effective in enhancing
student analytic thinking skills.

XVIII. Of the new curricula studied, USMES was found to be most
effective in enhancing student skills in related areas.

XIX. Student overall performance scores were observed to be
considerably lower when teachers reported having
received either inservice or preservice training in
the use of the program.
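
Statement I can be read as a statement about effect size. The following is a
minimal sketch, assuming that the effect size is Glass's (1976) standardized
mean difference (new-curriculum mean minus traditional mean, divided by the
standard deviation of the traditional group) and that scores in the
traditional group are approximately normally distributed. The function names
are illustrative only and are not taken from the project's analysis programs.

```python
from statistics import NormalDist


def glass_delta(mean_new, mean_traditional, sd_traditional):
    """Standardized mean difference in traditional-group SD units (assumed definition)."""
    if sd_traditional <= 0:
        raise ValueError("traditional-group standard deviation must be positive")
    return (mean_new - mean_traditional) / sd_traditional


def percentile_of_average_student(delta):
    """Percentile of the traditional group exceeded by the average
    new-curriculum student, assuming normally distributed scores."""
    return 100.0 * NormalDist().cdf(delta)


def delta_for_percentile(percentile):
    """Inverse relation: the effect size implied by a reported percentile."""
    return NormalDist().inv_cdf(percentile / 100.0)


if __name__ == "__main__":
    # Effect size implied by the 65% figure in Statement I.
    print(round(delta_for_percentile(65), 2))
    # Percentile implied by the self-concept effect size in Statement II.
    print(round(percentile_of_average_student(-0.08), 1))
```

Under these assumptions the 65th percentile in Statement I corresponds to an
effect size of roughly 0.39, and the self-concept effect size of -0.08 in
Statement II places the average new-curriculum student near the 47th
percentile of the traditional group.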

The 19 summary statements must be examined carefully not only 
in light of the data presented here and the detail of the primary 
research on which the analyses were based, but also in light of what 
was not reported here or in the primary research. Perhaps the 
greatest uncertainty lies in the original treatment conditions
themselves: was the PSSC, CHEM Study, ... , SCIS program really being
implemented according to the philosophy of the curriculum or were the 
materials just being used? In some studies, could the traditional 
treatment actually have been more of a "new curriculum" because of 
some innovative teacher methodologies than the new curriculum to which
it was being compared? This broad question of treatment fidelity and
verification is a critical issue which defies resolution but certainly 
can confound and distort conclusions. 

For example, summary statements X, XI, and XII dealing with
curriculum emphasis on inquiry, laboratory, process, and individualization
and their relationship to overall student performance are prime
candidates for the distorted data file. Recall the ratings on these
parameters were made by a panel of science educators. Would the
teachers using the various curricula rate them similarly? Even if they 
rated them similarly, would they implement the curricula with these 
same emphases? Surely we all have seen or heard of a teacher lecturing 
about the inquiry method! 

Similarly, the results of the inservice and preservice analyses 
(summary statement XIX) require careful examination. What proportion 
of the typical NSF inservice program was spent on learning about the
new curriculum and what proportion was spent in organic chemistry,
parasitology, or quantum mechanics? Should inservice and preservice
programs be abandoned or written off based on these data, or should
there be a resolve to revamp the programs and change the emphasis?






BIBLIOGRAPHY 






Barr, A. J., Goodnight, J. H., Sall, J. P. and Helwig, J. T. A user's
guide to SAS-76. Raleigh, North Carolina: Sparks Press, 1976.

Brown, B. B. and Webb, J. N. Valid and reliable observations of
classroom behavior. Classroom Interaction Newsletter, 1968,
4, 35-38.

Butts, D. P., Bybee, R. W., Gallagher, J. J. and Yager, R. E.
Assessing the current status of science education. In R.
Yager, Crisis in science education. Science Education Center
Technical Report #21. Iowa City, Iowa: The University of Iowa,
1980.

Collette, A. T. Science teaching in the secondary school: A guide
for modernizing instruction. Boston: Allyn and Bacon, Inc.,
1973.

Conant, J. B. The comprehensive high school: A second report to
interested citizens. New York: McGraw-Hill, 1967.

Gallagher, J. J. A summary of research in science education for the 
years 1968-1969: Elementary school level. Journal of Research 
in Science Teaching , 1972, £, 19-^6. 

Glass, G. V. Primary, secondary and meta-analysis of research.
Educational Researcher, 1976, 5, 3-8.

Haney, R. E. The changing curriculum: Science. Washington, D.C.:
National Education Association, 1966.

Haney, R. et al. A summary of research in science education for the
years 1965-1967: Elementary school level. Research review
series - science, paper 2. Columbus: ERIC Information Analysis
Center for Science and Mathematics Education, 1969. (ERIC
Document Reproduction Service No. ED 038 554)

Harms, N. C. and Yager, R. E. What research says to the science
teacher (Volume 3). Washington, D.C.: National Science
Teachers Association, 1981.

Helgeson, S. L., Blosser, P. L. and Howe, R. W. Science education
(Volume 1). The status of pre-college science, mathematics,
and social science education: 1955-1975. Washington, D.C.:
U.S. Government Printing Office, 1978.



Helwig, J. T. and Council, K. A. (Eds.) SAS user's guide: 1979
edition. Cary, North Carolina: SAS Institute, Inc., 1979.

Hurd, P. D. Biological education in American secondary schools
1890-1960. Washington, D.C.: American Institute of Biological
Sciences, 1961.

Jackson, G. B. Methods for integrative reviews. Review of
Educational Research, 1980, 50 (3), 438-460.

Jacobson, W. J. Approaches to science education research: Analysis
and criticism. Journal of Research in Science Teaching, 1970,
7, 217-225.

Kahl, S. and Harms, N. Project synthesis: Purpose, organization,
and procedures. In N. C. Harms and R. E. Yager (Eds.), What
research says to the science teacher (Volume 3). Washington,
D.C.: National Science Teachers Association, 1981.

Klopfer, L. Evaluation of learning in science. In B. S. Bloom, J. T.
Hastings and G. F. Madaus (Eds.), Handbook of formative and
summative evaluation. New York: McGraw-Hill, 1971.

Lacey, A. L. Guide to science teaching in secondary schools.
Belmont, California: Wadsworth Publishing Company, 1966.

National Assessment of Educational Progress. Science: Second
assessment (1972-73): Changes in science performance, 1969-73,
with exercise volume and appendix (April 1977); OU-S-21,
Science technical report: Summary volume (May 1977). Science:
Third assessment (1976-1977): 08-S-OU, Three national
assessments of science: Changes in achievement, 1969-77 (June 1978):
08-S-08, The third assessment of science, 1976-77. Released
exercise set (May 1978). Also some unpublished data from the
1976-77 Science Assessment, Denver, Colorado: 1870 Lincoln
Street.

National Science Foundation. What are the needs in precollege
science, mathematics, and social science education? Views from
the field. Washington, D.C.: National Science Foundation,
1980.

Novak, J. D. A case study of curriculum change. School Science and
Mathematics, 1969, 69, 375-384.

Ramsey, G. A. and Howe, R. W. An analysis of research on
instructional procedures in secondary school science. Part I: Outcomes
of instruction. The Science Teacher, 1969(a), 36 (3), 62-76.



Ramsey, G. A. and Howe, R. W. An analysis of research on
instructional procedures in secondary school science. Part II:
Instructional procedures. The Science Teacher, 1969(b), 36 (4),

Richardson, J. S. Science teaching in secondary schools. Englewood
Cliffs, New Jersey: Prentice-Hall, Inc., 1964.

Rutherford, F. J. Preface. In NSF Report SE 80-9, What are the
needs in precollege science, mathematics, and social science
education? Views from the field. Washington, D.C.: National
Science Foundation, 1980.

Schlessinger, F. R. and Helgeson, S. L. National programs in science
and mathematics education. School Science and Mathematics, 1969,
69, 633-643.

Schwab, J. J. Biology teachers' handbook. New York: Wiley, 1963.

Science Policy Research Division. The National Science Foundation
and pre-college science education: 1950-1975. A report
prepared for the subcommittee on science, research, and technology.
Washington, D.C.: U.S. Government Printing Office, 1975.

Stake, R. E. and Easley, J. Jr. Case studies in science education.
Washington, D.C.: U.S. Government Printing Office, 1978.

Thurber, W. A. and Collette, A. T. Teaching science in today's
secondary schools (3rd ed.). Boston: Allyn and Bacon, Inc., 1968.

Washton, N. S. Teaching science creatively in the secondary schools.
Philadelphia: W. B. Saunders Company, 1967.

Weiss, I. R. Report of the 1977 national survey of science,
mathematics, and social studies education. Washington, D.C.: U.S.
Government Printing Office, 1978.

Welch, W. W. The impact of national curriculum projects: The need
for accurate assessment. School Science and Mathematics, 1968,
68, 225-234.

Welch, W. W. Curriculum evaluation. Review of Educational Research,
1969, 39, 429-443.

Yager, R. E. Crisis in science education. Science Education Center
Technical Report #21. Iowa City, Iowa: The University of
Iowa, 1980a.

Yager, R. E. Status study of graduate science education in the
United States, 1960-80. Final report for NSF Grant Number
79-SP-0698, The University of Iowa, Iowa City, Iowa, 1980b.

Yager, R. E. Prologue. In N. C. Harms and R. E. Yager (Eds.),
What research says to the science teacher (Volume 3).
Washington, D.C.: National Science Teachers Association, 1981a.

Yager, R. E. Analysis of current accomplishments and needs in science
education. Unpublished report, Science Education Center, The
University of Iowa, 1981b.

Yager, R. E. The current situation in science education. (in press)



INSTRUCTIONAL SYSTEMS IN
SCIENCE EDUCATION



John B. Willett
June J. M. Yamashita*

School of Education 
Stanford University 



*The order of the co-authors has been determined alphabetically. 






ON BECOMING META-ANALYTICALLY LITERATE
(THOUGH PENNILESS)



As reported designs proved distressing
Reflections in Glass were a blessing
When coding a t
MS, r or p
Or rampant covariance guessing.

Though, as savants of the random statistic,
We've furthered the "cause analitique,"
To earn bread and butter
'Tis better, (than meta-)
To be a jongleur or auto-mechanique.



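TABLE OF CONTENTS



I. Setting Up the Meta-Analysis . . . . . . . . . . . . . . . 147
     Introduction and Definition of Terms . . . . . . . . . . 148
     Evolving the Coding Form . . . . . . . . . . . . . . . . 157
     Variables Included on the Coding Form . . . . . . . . .  158
     Variables Generated Prior to the Analysis . . . . . . .  170
     Variables Recoded Prior to the Analysis . . . . . . . .  171

II. Coding the Data . . . . . . . . . . . . . . . . . . . . . 173
     Sampling and Coding . . . . . . . . . . . . . . . . . .  173
     Sampling Restrictions . . . . . . . . . . . . . . . . .  175

III. Analyzing the Data . . . . . . . . . . . . . . . . . . . 178
     Breakdown of All Effect Sizes Over All Studies . . . . . 178
     Mean Effect Size, System by System . . . . . . . . . . . 186
     Group Sizes, System by System . . . . . . . . . . . . .  189
     Summary Data for the Individual Systems . . . . . . . .  191
     Mean Effect Sizes Broken Down by
       Selected Variables Across Systems . . . . . . . . . .  253

IV. Studies Included in this Report . . . . . . . . . . . . . 273

V. References . . . . . . . . . . . . . . . . . . . . . . . . 285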



LIST OF TABLES

Table                                                              Page

 1  Mean Effect Size by Year of Publication                         179
 2  Mean Effect Size by Form of Publication                         180
 3  Mean Effect Size by Grade                                       181
 4  Mean Effect Size by Assignment to Groups                        182
 5  Mean Effect Size by Subject Matter                              183
 6  Mean Effect Size by Type of Outcome Criterion                   183
 7  Mean Effect Size by Origin of Instrument Used                   184
 8  Mean Effect Size by Method Used to Calculate Effect Size        184
 9  Mean Effect Size by the Means
      Used in the Effect Size Calculation                           185
10  Mean Effect Size, System by System                              187
11  Final Size of Treatment Group Within Each System                189
12  Final Size of Control Group Within Each System                  190
13  Audio-Tutorial System                                           193
14  Computer-Linked Systems                                         196
15  Computer Assisted Instruction                                   199
16  Computer Managed Instruction                                    202
17  Computer Simulated Experiments                                  203
18  Contracts for Learning                                          206
19  Departmentalized Elementary School                              208
20  Individualized Instruction                                      212
21  Mastery Learning                                                216
22  Media-Based Systems                                             220
23  Television Instruction                                          224
24  Film Based Instruction                                          228
25  Personalized System of Instruction                              232
26  Programmed Instruction                                          235
27  Branched Programmed Instruction                                 237
28  Linear Programmed Instruction                                   240
29  Self-Directed Study                                             244
30  Source Papers                                                   247
31  Student Assisted Instructional Systems                          249
32  Team Teaching                                                   252
33  Effect Sizes by Form of Reporting for Each System               254
34  Effect Sizes by Grade Level of Subjects for Each System         256
35  Effect Sizes by Validity of Design for Each System              258
36  Effect Sizes by Subject Matter for Each System                  260
37  Effect Sizes by Immediate
      or Retention Measures for Each System                         262
38  Effect Sizes by Type of Outcome Criteria for Each System        263
39  Effect Sizes by
      Method of Measurement for Each System                         265
40  Effect Sizes by Calculation of Effect Size for Each System      267
41  Effect Sizes by Source of Means for Each System                 269






SECTION I: SETTING UP THE META-ANALYSIS 






SETTING UP THE META-ANALYSIS 

Introduction and Definition of Terms 

The Stanford group was assigned the question: "What are the effects of 
different instructional systems used in science teaching?" It was necessary 
initially to clarify the meaning of "systems" in order to provide as complete 
an analysis as possible while avoiding any overlap with the analysis
performed by other study centers. The following definition was provided by the
steering committee: 

An instructional system is a general plan for conducting a 
course over an extended period of time. It is general in that 
it often encompasses many aspects of a course (e.g., presentation 
of content, testing, size of study groups). Examples of instruc- 
tional systems are: mastery learning, competency-based instruc- 
tion, programmed instruction, modular instruction, mini-courses, 
ability grouping, team teaching, departmentalized vs. self- 
contained, diagnostic-prescriptive instruction, independent 
study/projects, computer-managed or computer assisted instruc- 
tion, audio-tutorial. 

An earlier draft had stated that instructional systems are "usually
evaluated in an actual classroom as opposed to being evaluated in a
laboratory," and "typically involve the comparison of a new learning approach
with traditional instruction."

On the basis of such definitions an initial list of systems was
provided at the training session in October 1980, and included those listed
above. Subsequent refinements led to the designation of certain systems on
the original list as "methods" or "techniques" rather than "systems," and
these were then reallocated to other study centers.

The following systems were covered by the Stanford group and are 
reported here: 



(1) Audio-Tutorial

(2) Computer-Linked, also reported separately in three categories:

(a) Computer Assisted Instruction (CAI)

(b) Computer Managed Instruction (CMI)

(c) Computer Simulated Experiments (CSE)

(3) Contracts for Learning

(4) Departmentalized Elementary School

(5) Individualized Instruction 

(6) Mastery Learning 

(7) Media-Based Instruction, also reported separately as: 

(a) Film Instruction 

(b) Television Instruction 

(8) Personalized System of Instruction (Keller PSI) 

(9) Programmed Learning 

(a) Branched Programmed Learning 

(b) Linear Programmed Learning 

(10) Self-Directed Study 

(11) Use of Original Source Papers in the Teaching of Science 

(12) Team Teaching 



Each of the systems included in this report will be briefly discussed. 

(1) Audio-Tutorial System. Good (1973:50) defined the Audio-Tutorial
System as "a self-pacing multimedia system of instruction that features
tape recorded lessons with kits of learning materials and instruction sheets
for individual learning in study carrels." Descriptions given in studies
evaluated here which purported to investigate this system were consistent
with the above definition. Frequently, the method was referred to on the
college level as "Postlethwaite's Audio-Tutorial System."
Dr. S. N. Postlethwaite first used this system in a freshman botany
class in 1961 at Purdue University, and described it as: "Audio programming
of learning experiences . . . includes lectures, reading of text or other
appropriate material, making observations on demonstration set-ups, doing
experiments, watching movies and/or any other appropriate activities helpful
in understanding the subject matter" (Postlethwaite, Novak, & Murray, 1964:6).

Audio-tutorial lessons may incorporate behavioral objectives, learning
for mastery, self-pacing, and multi-media activities. Audiotapes are used
"to pace students through integrated laboratory, lecture, discussion, and
demonstration activities" (Nordland et al., 1972:673). (However, many
studies which were coded here failed to report the exact constitution of
their program, preferring simply to assume that such details were implicit
in the label "Audio-tutorial instruction.")

As in other forms of individually-paced instruction, audio-tutorial
systems purport to use learning time more efficiently and effectively:

The crucial variable in [comparing A-T and traditional 
instruction] is not whether students under one instructional 
approach acquire more knowledge than under the other instruc- 
tional approach but rather the analysis of learning time 
required to reach a given level of attainment and the quality 
of subsuming concepts acquired in the process [Novak 1970:782]. 

However, in the studies coded here, this question was largely ignored and
consequently no summary on use of time is included in this report.

(2) Computer-Linked Systems. This category was created during the
data analysis by consolidating effect sizes obtained under the next three
headings.

(a) Computer Assisted Instruction (CAI) pertains to the use of
the computer as a teaching machine. Good (1973:589) defined teaching
machines as devices which ". . . control the material to which the
student has access at any moment, preventing him from looking ahead or
reviewing old items; . . . contain a response mechanism, that is, . . .
a keyboard or selection buttons; some provision is made for knowledge
of results . . . ; . . . score responses and tabulate errors." These
tutoring programs sometimes (but not always) provide for student choice
in content, sequencing, or type of instruction. The claimed superiority
of CAI to conventional teaching derives from its supposed potential for
providing immediate feedback to the student on each response and
offering appropriate remediation in a manner that is often
not feasible in the traditional classroom.

That not very many studies were found using computers with
pre-college classes is not surprising: a recent National Science Foundation
study (Weiss 1978:19) reported that only 9% of science classes in
grades 10-12 ever use computers or computer terminals, although 36%
of high schools have them, indicating that computers are used more for
mathematics classes or for administrative purposes than for science
instruction.

(b) Computer Managed Instruction (CMI), on the other hand, does 
not provide actual instruction for the student. Instead, the computer 
may be used to generate tests for students based on specific objectives, 
making random selections from a pool of items; to keep an up-to-date 
record of each student's progress in meeting learning objectives; to 
prescribe additional learning or remediation tasks; to plan inter- 
actively an individual student's route through pre-stored curricula, 
and so forth. 
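
The test-generation function described above can be sketched as follows. This
is a minimal illustration only, assuming that each item in the pool is tagged
with the single objective it measures; the function name, parameters, and data
layout are hypothetical and do not describe any system used in the studies
coded here.

```python
import random
from collections import defaultdict


def generate_test(item_pool, objectives, items_per_objective, rng=None):
    """Draw a random test from an item pool, objective by objective.

    item_pool: iterable of (item_id, objective) pairs (hypothetical layout).
    objectives: the objectives the test should cover, in order.
    items_per_objective: how many items to draw for each objective.
    rng: optional random.Random instance, so a draw can be reproduced.
    """
    rng = rng or random.Random()

    # Group the pool by objective so each draw is restricted to one objective.
    by_objective = defaultdict(list)
    for item_id, objective in item_pool:
        by_objective[objective].append(item_id)

    test = []
    for objective in objectives:
        candidates = by_objective[objective]
        if len(candidates) < items_per_objective:
            raise ValueError(f"not enough items for objective {objective!r}")
        # sample() selects without replacement, so no item repeats on a test.
        test.extend(rng.sample(candidates, items_per_objective))
    return test
```

Passing a seeded random.Random instance makes a particular student's test
reproducible, which is one way such a system could keep the up-to-date
record of progress mentioned above.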



(c) Computer Simulated Experiments (CSE) have much potential in
science instruction but are found in only a few studies at the
pre-college level (Hartley, 1976:69-70). These simulations allow the
student to operate with a simpler system than would actually be present
in the laboratory, with unimportant or extraneous factors eliminated
in the computer program. In addition, simulations allow for a wider
range of student explorations in areas that would, in reality, be too
dangerous, too time-consuming, or too costly.

(3) Contracts for Learning are established between an individual
student and the teacher, and include the content, activities, deadlines, and
methods of evaluation. Contracts would generally be a component of
Self-Directed Study or other forms of independent study. A 1977 study for the
National Science Foundation found that 78% of science classes never use
contracts (Melton 1980:126), which may indicate only that the remaining 22%
of science classes use contracts occasionally or seldom, or for specific
aspects of a course.

(4) Departmentalized Elementary School refers to the teaching of 
elementary school science by a specialist rather than by the typically
generalist teacher. The specialist would ordinarily have a greater degree 
of academic training in the particular aspect of the science taught. 

(5) Individualized Instruction subsumes several of the other areas 
described in this section, and is a catch-all term for many different 
approaches. In many cases, the experimental intervention used by the 
studies included under this system was labelled "individualized" when all 
students studied the same learning materials in the same sequence and their 
learning was evaluated in the same way; the only difference was in their 
pacing, with students allowed to proceed "at their own rate," using
individual packets of learning material. In contrast, Ramsey and Howe
(1969:73) offered a much broader description:

Individualized instruction attempts to provide a complete instruc- 
tional program designed explicitly for each individual, taking 
into account his background experience, interests, and ability. 

Individualized instruction may also have been coded as Audio-Tutorial, 
Computer Assisted Instruction, Contracts, PSI, Programmed Instruction, or 
Self-Directed Study as appropriate.

Marchese (1977:699), in a literature search of individualized instruc- 
tion in science, found that although much research had been conducted in 
other fields, very little had been reported in science. He also questioned 
the adequacy of instructional materials prepared for studies on
individualized instruction (1977:701), and Herring et al. (1974:11) suggested an
interaction of learning materials with methods of instruction may exist.

(6) Mastery Learning, as presented by Bloom in 1968, defined mastery
in terms of behavioral objectives, with class instruction supplemented with
feedback/correction mechanisms (Block 1971:7-8). Tests on unit objectives
are followed by supplementary instruction on objectives not attained, and
the student is retested until a pre-selected mastery level is achieved.
Because specific levels of attainment are specified, the important variable
in mastery learning is the time required to reach those levels; however,
in the studies coded here this variable was largely ignored and thus
appropriate conclusions cannot be reported.
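
The test, supplementary instruction, and retest cycle described above can be
sketched as a simple loop. This is an illustrative sketch under stated
assumptions, not a description of any program evaluated here: the callables
take_test and remediate, the 80 percent default mastery level, and the cap on
attempts are all hypothetical. The number of cycles returned is a rough
stand-in for the "time required" variable.

```python
def mastery_cycle(take_test, remediate, objectives, mastery_level=80, max_attempts=5):
    """Retest a student on unmet objectives until mastery or the attempt cap.

    take_test(objectives) -> dict mapping each objective to a percent score.
    remediate(objectives) -> None; supplementary instruction on unmet objectives.
    Returns (number of test cycles used, objectives still not mastered).
    """
    pending = list(objectives)
    if not pending:
        return 0, []

    for attempt in range(1, max_attempts + 1):
        scores = take_test(pending)
        # Keep only the objectives whose score falls below the mastery level.
        pending = [obj for obj in pending if scores[obj] < mastery_level]
        if not pending:
            return attempt, []
        if attempt < max_attempts:
            # Supplementary instruction targets only the objectives not attained.
            remediate(pending)
    return max_attempts, pending
```

Recording the first element of the returned pair for each student would give
a crude measure of the time required to reach the specified level, the
variable the coded studies largely ignored.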

(7) Media-Based Instruction: Television and Film. Studies coded as
either television instruction or film instruction are those in which these
forms of media provide the primary instruction rather than supplements to
classroom teaching. Several television instruction studies were evaluations
of a series of televised lessons which were prepared or presented at the
State Department of Education level. Many of the studies coded on film
instruction used the "Harvey White Films" for Physics.

Slides and audio tapes were included on the coding sheet, and resulting 
effect sizes are incorporated into "Media-Based Instruction," but these two 
categories are not reported separately since very few effect sizes were 
obtained for them. 

(8) Personalized System of Instruction (PSI). Frequently referred to 
on the college level as the "Keller Plan," PSI generally consists of the 
following features (Carmichael, 1976:791-2): self-paced; learning materials
divided into small modules, each of which must be mastered before going on
to the next; students used as graders and tutors; lack of reliance on live
lectures, with printed materials being the primary form of communication.
PSI has been widely criticized for the absence of the "motivating factor"
that can come from live lectures and contact with the instructor (Palladino,
1979:323; Emerson, 1975:228; Kuska, 1976:505). As with other systems, the
results of studies of this method of instruction may often be confounded
with the value of the instructional materials specifically prepared for the
investigation. A detailed study guide for each student is a crucial factor
(Smith, 1976:510; Novak, 1974:15), which may not have been provided in
every case. 

PSI is most likely to be found at the college level, and few studies 
have been found at lower levels. 

(9) Programmed Learning. Schramm, frequently cited for his
leadership in programmed instruction, summarized what he called the essential
characteristics of programmed instruction of the Skinnerian type (1962:99):

a) an ordered sequence of stimulus items,

b) to each of which a student responds in some specified way,

c) his responses being reinforced by immediate knowledge of results,

d) so that he moves in small steps,

e) therefore making few errors and practicing mostly correct responses,

f) from which he moves, by a process of successively closer
approximations, toward what he is supposed to learn from the program.

In studies comparing programmed and conventional instruction, Silberman 
(1962:19) noted (as was also observed here) that "conditions of conventional 
instruction are seldom described in such reports." In fact, this lack was 
evident in many studies in other categories as well, implying that the 
salient features of conventional or traditional teaching are well known. 

A 1977 National Science Foundation study reported that 71% of all
science classes never use programmed instruction (Melton, 1980:126). In all
probability, the remaining 29% use this form of instruction for short units.
In studies which explored teachers' and students' affective responses to
programmed instruction (such as the Fund for the Advancement of Education's
[1964] Four Case Studies in Programed Instruction), a frequent comment was
that students became bored with the materials. Teachers who intended to
continue use of these materials beyond the experimental period were those
who tended to use them along with other classroom activities, for
remediation or enrichment, or as aids to classroom instruction rather than as a
replacement. Short programmed units were found most useful when
incorporated into a planned sequence of classroom activities.

Although a study may be coded as a comparison between a certain system 
of instruction and conventional teaching, a major part of what is being 
tested may be the value of the treatment protocol. A doctoral candidate 
using a self-developed package would be testing not only the efficacy of
the instructional approach but also of the materials themselves. 

Studies on programmed learning were coded as "linear" or "branched,"
but only five effect sizes were obtained for the latter. The small number
of studies using branching is probably a result of the greater difficulty in
developing such programs, since branching provides for the student to be
"routed through one or more remedial sequences of frames if he misses a
question or skipped ahead if he evidences mastery of content in a sequence"
(Good 1973:70).
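
The difference between linear and branched programs can be made concrete
with a small routing sketch. It is offered only as an illustration of the
definition quoted above: the Frame structure, its field names, and the way
mastery is signalled are assumptions introduced here, not features of any
program in the coded studies. A linear program is the special case in which
no frame has a remedial or skip-ahead target.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Frame:
    next_id: Optional[str]              # frame that normally follows this one
    remedial_id: Optional[str] = None   # start of a remedial sequence, if any
    skip_id: Optional[str] = None       # frame to jump to on evidence of mastery


def route(frame: Frame, correct: bool, mastery: bool = False) -> Optional[str]:
    """Choose the next frame from the student's response to this frame."""
    if not correct and frame.remedial_id is not None:
        return frame.remedial_id        # missed question: remedial sequence
    if correct and mastery and frame.skip_id is not None:
        return frame.skip_id            # evidence of mastery: skip ahead
    return frame.next_id                # otherwise proceed as in a linear program


def trace(frames: Dict[str, Frame], start: str,
          outcome: Callable[[str], Tuple[bool, bool]],
          max_steps: int = 1000) -> List[str]:
    """Return the frames visited; outcome(frame_id) gives (correct, mastery)."""
    path: List[str] = []
    current: Optional[str] = start
    while current is not None and len(path) < max_steps:
        path.append(current)
        correct, mastery = outcome(current)
        current = route(frames[current], correct, mastery)
    return path
```

Every branching frame needs its own remedial material and skip target, which
is consistent with the suggestion above that branched programs are harder to
develop than linear ones.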

(10) Self-Directed Study. This strategy usually includes the features
described as "Contracts for Learning," with students being principally
responsible for "directing" their own study. However, in the studies we
reviewed, students were somewhat restricted in their "self-direction": they
might have a choice in the order in which they studied various units, and
sometimes in the methods in which they studied the units and were evaluated,
but were unlikely to have carte blanche "across the board."

(11) Source Papers. This system of teaching is based principally on
the use of selected original scientific papers, documents, books, etc., 
rather than on the use of a school textbook. A course based on the use of 
source papers involves students in the finding, reading and interpretation 
of these original documents with or without guidance from the teacher. 

(12) Team Teaching is "a type of instructional organization involving 
teaching personnel and the students assigned to them, in which two or more 
teachers are given joint responsibility for all or a significant part of the 
instruction of the same group of students" (Good 1973:590). As utilized in 
most cases, teachers shared the responsibility of large group lectures while 
being individually responsible for their assigned small or medium-sized 
groups. Teachers may alternate in presenting the large group lectures, or 
one teacher may be judged by the team as a superior lecturer and thus will 
make all presentations while the others play supporting roles and continue 
to handle individually their small groups. Team planning, in which those
persons teaching the same subject in a school participate, by itself does
not constitute team teaching; the "joint responsibility" mentioned above is 
the crucial factor. In many cases, the studies reported here as investiga- 
tions of the team teaching system failed to delineate what proportion of 
the total classroom time was spent in large group lecture, small group 
discussion, tutorial and so forth. However, for a particular study to have
been coded as "team teaching," it was regarded as sufficient for the inves- 
tigator to have labelled it thus. 



Evolving the Coding Form 

Jackson (1978) recommended that, in conducting integrative reviews, 
previous reviews on the same or similar topics be consulted prior to samp- 
ling and coding. In the case of the current meta-analysis, this step was 
performed by the steering committee. Then, on the basis of this consulta- 
tion, a draft coding sheet was produced in Colorado. During the training 
session and ensuing weeks, emphasis was placed on speed in coding studies 
rather than on the evaluation of the instrument itself, although some modi- 
fications were made to the coding sheets during the early stages of the 
coding. 

It may have been more appropriate, however, to involve the research
assistants in the initial review, as it would have enabled them to construct 
a more coherent coding sheet. For example, it would have then been possible 
to identify the most prominent features of different forms of individualized 
instruction, say, and it could have been decided that every study of a
system "A" would consistently have a particular group of variables coded "yes";
or, conversely, that coding a certain feature "yes" implies certain other
characteristics that are not included on the coding sheet. Since this was 
not the case, the coding sheets evolved in a more restricted manner. Once 
coding had begun, major changes on the coding sheet were difficult to make 
due to the lack of availability of previously coded studies which were 
returned to Colorado for circulation to other centers. 

Previous reviews would not, however, have illustrated the lack of utility
of many items on the coding sheet: that few studies, for example, describe
in concise terms the school community and socio-economic status of the 
groups, the size of the school, or student characteristics apart from IQ or 
other standardized ability measures; characteristics of the participating 
teachers, such as age, years of teaching, educational background, or even 
sex; and some characteristics of the experimental procedure, such as length 
of each lesson, class size, or initial size of the experimental groups. A 
great deal of unproductive time was spent in scanning reports, looking for 
information on these often-omitted variables; the desire to fill in as many 
blanks as possible on the coding sheet was somewhat compulsive, and omission 
of those items would have decreased the coding time per study considerably. 
Since it was not possible prior to coding many studies to identify these 
often-omitted variables, some decisions were made midway through the coding 
that there would be no intensive search for information that was likely to 
be missing. 

The final coding sheet (the fourth version) consisted of the following 
eleven sections, each with a number of coding variables: 

(1) Identification of the Study 

(2) Student Identification (Treatment group; control group) 

(3) Context Characteristics (Treatment group; control group)

(4) Teacher Characteristics (Treatment group; control group)

(5) Design Characteristics (Treatment group; control group) 

(6) Treatment Characteristics (Treatment group; control group) 

(7) Features (Treatment group; control group) 

(8) Group Structure (Treatment group; control group) 

(9) Materials (Treatment group; control group) 

(10) Outcome Characteristics 

(11) Effect Size Calculation. 

When it was not possible to code a particular variable, the column(s)
was (were) left blank; in the computer analysis, blanks were given the value
of -9 to distinguish them from variables (if any) which were coded 0.
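
The missing-value convention can be illustrated with a short sketch. It assumes a fixed-width card image in which each variable occupies the column range listed on the coding form; the function name `read_field` and the sample card string are hypothetical and are not taken from the project's own programs.

```python
MISSING = -9  # value assigned to blank columns in the computer analysis


def read_field(card: str, first_col: int, last_col: int) -> int:
    """Return the integer coded in columns first_col..last_col
    (1-indexed, inclusive), or MISSING when the columns are blank."""
    text = card[first_col - 1:last_col].strip()
    return int(text) if text else MISSING


# Hypothetical Card 2 image: SAGE1 in cols 1-2, GRADE1 in cols 3-4,
# IQ1 (cols 5-7) left blank by the coder, SIQ1 in col 8.
card2 = "1510   1"
sage1 = read_field(card2, 1, 2)
grade1 = read_field(card2, 3, 4)
iq1 = read_field(card2, 5, 7)
siq1 = read_field(card2, 8, 8)
```

Because a blank becomes -9 rather than 0, a variable that was genuinely coded 0 stays distinguishable from one that could not be coded at all.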

The variables coded are presented on the following pages. A column has
been added to the right side of each page, noting the number of coding sheets
(out of 341) which did not include information on the given variable. Some
of the variables were of only incidental interest, but some were hoped even- 
tually to yield interesting sub-analyses (e.g., the mortality of subjects: 
initial size minus final size of treatment and control groups). 
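
As a rough sketch of the mortality sub-analysis mentioned above, subject loss can be derived from the initial and final group sizes (NSBEG1 and NSEND1 on the coding form). The function name `mortality` is hypothetical, and returning `None` for an uncodable case is an assumption made here for clarity.

```python
from typing import Optional

MISSING = -9  # blank columns on the coding sheet


def mortality(n_begin: int, n_end: int) -> Optional[int]:
    """Subjects lost during the study: initial size minus final size.
    Returns None when either size was not reported."""
    if n_begin == MISSING or n_end == MISSING:
        return None
    return n_begin - n_end


lost = mortality(30, 27)             # both sizes reported
unknown = mortality(MISSING, 27)     # initial size not reported
```

Since the initial group size was missing on roughly 55% of the coding sheets, a sub-analysis of this kind could draw on fewer than half of the effect sizes at most.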

Some variables which were not included on the coding sheet may have 
possibly yielded other relationships of interest: for example, whether the
investigator (mostly in the case of dissertations) was the teacher for both 
the treatment and control groups, as was the case in several studies, or 
whether the teachers were unaffiliated with the designing of the study; 
whether the treatment and control groups were from the same school, differ- 
ent schools in the same district, or different districts; whether the same 
teachers taught both treatment and control groups; whether the study was 
conducted over different years on the same population, with the base year 
being the control condition and the later year being the experimental (as in
the case of several studies).






Variables Included on the Coding Form

Card  Cols.   Variable   Description and codes                       Data missing*
              Name                                                    (out of 341)
                                                                         No.     %

1     3-6     STUDY      Study identification code                         0     0
      7-8     COMP       Comparison code                                   0     0
      9-10    OUTCOME    Outcome code                                      0     0
      11-14   YEAR       Year in which study was reported                  0     0
      15      FORM       Form in which study was reported                  0     0
                           1. Journal article
                           2. Book
                           3. Masters thesis
                           4. Doctoral thesis
                           5. Unpublished article
                           6. Conference paper
2     1-2     SAGE1      Mean age of students in treatment group           4    1%
      3-4     GRADE1     Modal grade of treatment group                    3    1%
      5-7     IQ1        Average IQ of treatment group                   122   37%
      8       SIQ1       Source of treatment group IQ                    127   37%
                           1. Stated
                           2. Inferred
      9       HIQ1       Homogeneity of treatment group IQ               135   40%
                           1. Homogeneous
                           2. Heterogeneous
      10-12   SSEX1      Percent female in treatment group               260   76%
      13-15   [illeg.]   Percent minority in treatment group             322   94%
      16      SPMIN1     Predominant minority in treatment group         321   94%
                           1. Mexican          4. Native American
                           2. Other Hispanic   5. Black
                           3. Asian            6. Other
      17-19   SPPMIN1    Percent predominant minority in                 331   97%
                           treatment group
      20      SES1       Mean socioeconomic status of treatment group    252   74%
                           1. Low
                           2. Medium
                           3. High
      21      HSES1      Homogeneity of treatment group SES              255   75%
                           1. Homogeneous
                           2. Heterogeneous

*"Missing" indicates that information on the specific variable could not be
found in a report, or (as on Cards 6 through 9) that the coder had no basis
for inferring that some feature either was or was not included.



Card  Cols.   Variable   Description and codes                       Data missing
              Name                                                    (out of 341)
                                                                         No.     %

2     22      HAND1      Treatment group handicap, if any                (Deleted)
                           1. Vision impaired
                           2. Hearing impaired
                           3. Learning disabled
                           4. Emotionally disturbed
                           5. Multiple handicaps
                           6. Other
      23      GROUP1     Treatment group tracking                        316   93%
                           1. Not grouped
                           2. Low track
                           3. Medium track
                           4. High track
      24-26   NSBEG1     Initial size of treatment group           [illegible] 55%
      27-29   NSEND1     Final size of treatment group                    13    4%
      30      SIZ1       School size of treatment group                  263   77%
                           1. Less than 50
                           2. 50 to 199
                           3. 200 to 499
                           4. 500 to 999
                           5. 1000 to 2000
                           6. More than 2000
      31      COMM1      Community type of treatment group               112   33%
                           1. Urban
                           2. Rural
                           3. Suburban

ON CARD 3, COLUMNS 1-31 CONTAIN THE SAME INFORMATION ON THE
CONTROL GROUP THAT CARD 2 DOES ON THE TREATMENT GROUP. ON
CARD 3, THE VARIABLE NAMES END WITH 2 INSTEAD OF 1
(e.g., COMM2).

4     1-2     NTEACH1    Number of teachers in treatment group            39   11%
      3-4     TAGE1      Mean teacher age in treatment group             292   86%
      5-6     NEXP1      Treatment group teachers, average               272   80%
                           number of years of teaching
      7-8     NSCI1      Average number of years of science              302   89%
                           teaching
      9-10    NCURR1     Average number of years teaching                328   96%
                           this curriculum
      11-13   TSEX1      Percent female teachers in treatment group      273   80%
      14-16   TRAC1      Percent minority teachers in treatment group    326   96%
      17      TPMIN1     Predominant minority of treatment group         334   98%
                           teachers
                           1. Mexican          4. Native American
                           2. Other Hispanic   5. Black
                           3. Asian            6. Other



Card  Cols.   Variable   Description and codes                       Data missing
              Name                                                    (out of 341)
                                                                         No.     %

4     18-20   TPPMIN1    Percent predominant minority teachers           334   98%
                           in treatment group
      21      TBACK1     Educational background of treatment             291   85%
                           group teachers
                           1. Less than B.A.
                           2. B.A. only
                           3. B.A. + 15 units
                           4. M.A. only
                           5. M.A. + 15 units
                           6. M.A. + 30 units
                           7. Doctorate
      22      TPSERV1    Treatment group teacher inservice               284   83%
                           training prior to experiment
                           1. Low; one-shot
                           2. Medium: series of lectures
                              or workshops
                           3. Specialization
      23      TNSF1      Training through N.S.F.?                        303   89%
                           1. Yes
                           2. No
      24      TUNIV1     Training obtained at university?                288   84%
                           1. Yes
                           2. No
      25      TLOCAL1    Training obtained locally?                      298   87%
                           1. Yes
                           2. No
      26      ACCEPT1    Treatment group teachers' acceptance            264   77%
                           of philosophy
                           1. Low
                           2. Medium
                           3. High
      27      SASS1      Assignment of students to treatment group         0    0%
                           1. Stratified random
                           2. Random
                           3. Matched
                           4. Intact random
                           5. Intact nonrandom
                           6. Self-selected
      28      TASS1      Assignment of teachers to treatment group         6    2%
                           1. Random
                           2. Nonrandom
                           3. Self-selected
                           4. Crossed
                           5. Matched
      29      VALID1     Treatment group rated internal validity          77   23%
                           1. Low (intact, highly dissimilar)
                           2. Medium (random or intact, some threat)
                           3. High (random, low mortality)



Card  Cols.   Variable   Description and codes                       Data missing
              Name                                                    (out of 341)
                                                                         No.     %

4     30      UNIT1      Treatment group unit of analysis                  4    1%
                           1. Individual
                           2. Classroom subgroup
                           3. Classroom
                           4. School
                           5. Other
      31      TYPE1      Type of study                             [illegible]  1%
                           1. Correlational
                           2. Quasi-experimental
                           3. Experimental

ON CARD 5, COLUMNS 1-31 CONTAIN THE SAME INFORMATION ON THE
CONTROL GROUP THAT CARD 4 DOES ON THE TREATMENT GROUP. ON
CARD 5, THE VARIABLE NAMES END WITH 2 INSTEAD OF 1.

6     1       SUBMA1     Subject matter in treatment group                 0    0%
                           1. General science    5. Earth science
                           2. Life science       6. Chemistry
                           3. Physical science   7. Physics
                           4. Biology            8. Other
      2-3     DURATN1    Duration of treatment group program              13    4%
                           in weeks
      4-5     WEEKS1     Time elapsed prior to testing, in weeks          18    5%
      6-8     TIME1      Minutes per week of treatment                    44   13%
      9-10    FREQ1      Frequency of testing, times per month           323   95%
      11      FIDCUR1    Treatment group fidelity to curriculum          288   84%
                           1. Low
                           2. Medium
                           3. High
      12      FIDTRE1    Fidelity to treatment                           274   80%
                           1. Low
                           2. Medium
                           3. High
      13      SUPINT1    Nature of implementation                         16    5%
                           1. Supplemental
                           2. Integral
      14      BEHOBJ1    Behavioral objectives in treatment group         69   20%
                           1. Used
                           2. Not used
      15      SELPAC1    Self paced in treatment group                     2    1%
                           1. Used
                           2. Not used
      16      IMFEED1    Immediate feedback in treatment group            51   15%
                           1. Used
                           2. Not used



Card  Cols.   Variable   Description and codes                       Data missing
              Name                                                    (out of 341)
                                                                         No.     %

6     17      DIATEST1   Diagnostic testing and prescription              54   16%
                           in treatment group
                           1. Used
                           2. Not used
      18      CAI1       Computer assisted instruction in                  2    1%
                           treatment group
                           1. Used
                           2. Not used
      19      CMI1       Computer managed instruction in                   2    1%
                           treatment group
                           1. Used
                           2. Not used
      20      CSE1       Computer simulated experiments in                 2    1%
                           treatment group
                           1. Used
                           2. Not used
      21      TEAM1      Team teaching in treatment group                  2    1%
                           1. Used
                           2. Not used
      22      TTUTOR1    Teacher as tutor in treatment group              18    5%
                           1. Used
                           2. Not used
      23      PTUTOR1    Pupil as tutor in treatment group                 8    2%
                           1. Used
                           2. Not used
      24      INDINS1    Individualized instruction in treatment           2    1%
                           group
                           1. Used
                           2. Not used
      25      UNITAPP1   Unit approach to instruction in treatment        31   11%
                           group
                           1. Used
                           2. Not used
      26      DEPT1      Departmentalized elementary school in             4    1%
                           treatment group
                           1. Used
                           2. Not used
      27      USESO1     Source papers in treatment group                  2    1%
                           1. Used
                           2. Not used
      28      TRAD1      Traditional science classroom in                  2    1%
                           treatment group
                           1. Used
                           2. Not used

ON CARD 7, COLUMNS 1-28 CONTAIN THE SAME INFORMATION ON THE
CONTROL GROUP THAT CARD 6 DOES ON THE TREATMENT GROUP.



Card  Cols.   Variable   Description and codes                       Data missing
              Name                                                    (out of 341)
                                                                         No.     %

8     1-2     CLASIZ1    Average class size in treatment group            39   11%
      3       FLEXMOD1   Flexible modular scheduling in                    2    1%
                           treatment group
                           1. Used
                           2. Not used
      4       LARGE1     Large group organization                         10    3%
                           1. Used
                           2. Not used
      5       MEDGRP1    Normal class grouping in treatment group          9    3%
                           1. Used
                           2. Not used
      6       SMLGRP1    Small group organization                         50   15%
                           1. Used
                           2. Not used
      7       SINGLE1    Group of 1 student                               59   17%
                           1. Used
                           2. Not used
      8       LABACT1    Laboratory activities in treatment group         65   19%
                           1. Used
                           2. Not used
      9       DEMO1      Teacher demonstrations in treatment group       147   43%
                           1. Used
                           2. Not used
      10      STRLAB1    Student lab activities structured               125   37%
                           in treatment group
                           1. Used
                           2. Not used
      11      UNSTR1     Student lab activities unstructured             150   44%
                           in treatment group
                           1. Used
                           2. Not used
      12      MATER1     Nature of treatment group learning               64   19%
                           materials
                           1. Published
                           2. Modified published
                           3. Original
      13      KITS1      Learning kits in treatment group                109   32%
                           1. Used
                           2. Not used
      14      LINPRO1    Linear programmed materials                       9    3%
                           1. Used
                           2. Not used
      15      BRANCH1    Branched programmed materials                     9    3%
                           1. Used
                           2. Not used



Card  Cols.   Variable   Description and codes                       Data missing
              Name                                                    (out of 341)
                                                                         No.     %

8     16      GRREAD1    Programmed materials graded by                   14    4%
                           reading level in treatment group
                           1. Used
                           2. Not used
      17      SELFDIR1   Self-directed study                               6    2%
                           1. Used
                           2. Not used
      18      STASS1     Student-assisted instructional program            1    0%
                           1. Used
                           2. Not used
      19      MEDBAS1    Media-based instruction                           1    0%
                           1. Television
                           2. Not used
                           3. Film
                           4. Teaching machines
                           5. Slides
                           6. Tapes
      20      BBOARD1    Victor electrowriter                              2    1%
                           1. Used
                           2. Not used
      21      MASTREQ1   Mastery learning                                151   44%
                           1. Required
                           2. Not required
      22-24   MASTLEV1   Level of mastery required                       335   98%
      25      TDIRREM1   Teacher-directed remediation                    150   44%
                           1. Used
                           2. Not used
      26      SDIRREM1   Student-directed remediation                    150   44%
                           1. Used
                           2. Not used
      27      PSI1       Keller Personalized System of Instruction       152   45%
                           1. Used
                           2. Not used
      28      AUDTUT1    Audio-Tutorial                                  159   47%
                           1. Used
                           2. Not used
      29      CONTRAC1   Contracts for learning                          190   56%
                           1. Used
                           2. Not used

ON CARD 9, COLUMNS 1-29 PROVIDE THE SAME INFORMATION ON THE
CONTROL GROUP THAT CARD 8 DOES ON THE TREATMENT GROUP.




Card  Cols.   Variable   Description and codes
              Name

10    1-2     TYPCRIT    Type of outcome criterion
                            1. Cognitive low (recall, comprehension)
                            2. Cognitive high (application)
                            3. Cognitive mixed/general achievement
                            4. Problem solving
                            5. Affective toward subject
                            6. Affective toward science
                            7. Affective toward procedure/method
                            8. Values
                            9. Process skills
                           10. Methods of science
                           11. Psychomotor (lab skills)
                           12. Critical thinking
                           13. Creativity
                           14. Decision making
                           15. Logical thinking
                           16. Spatial reasoning
                           17. Self-concept
                           18. Science perceptions
      3       CONG1      Congruence of measure with treatment program
                            1. Low
                            2. Medium
                            3. High
      4       CONG2      Congruence of measure with control program
                            1. Low
                            2. Medium
                            3. High
      5       METHMS     Method of measurement (type of instrument)
                            1. Published, nationally available,
                               standardized
                            2. Modification of national standardized
                            3. Ad hoc written tests
                            4. Classroom evaluation, excluding #1-3
                            5. Observation (passive, unstructured)
                            6. Structured interview, assessment
                            7. Other
      6       REACT      Reactivity of measure
                            1. Low; cognitive measures, 1 administration
                               or long lag, not alterable
                            2. Medium
                            3. High; affective, transparent, alterable
      7-8     SOURCE     Calculation of effect size
                            1. Directly from reported or raw data
                            2. Reported with direct estimates (ANOVA, etc.)
                            3. From frequencies reported on ordinal scales
                            4. Backwards from other variances of means
                            5. Nonparametrics (other than #3)
                            6. Estimated from independent sources
                            7. Estimated from variance (correlation guessing)
                            8. Estimated from p-value
                            9. From raw data with teacher (year) effects removed
                           10. Other
                           11. From percentiles



Card  Cols.   Variable   Description and codes
              Name

10    9       SOMEANS    Source of means
                            1. Unadjusted posttest
                            2. Covariance adjusted
                            3. Residual gains
                            4. Pre-post differences
                            5. Other
      10      SIGNIF     Reported significance
                            1. p < .005
                            2. .005 < p < .01
                            3. .01 < p < .05
                            4. .05 < p < .10
                            5. p > .10
                            6. "not significant"
      11      DVUNITS    Dependent variable units
                            1. Grade-equivalent
                            2. Other
      12-15   GEU        Mean difference in grade equivalent units
      16      INDIV      Group variances reported individually
                            1. Yes
                            2. No
      17-20   RATIO      Ratio of treatment to control group standard
                           deviation
      21-24   ESE        Effect size based on treatment group standard
                           deviation
      25-28   ESC        Effect size based on control group standard
                           deviation
      29-32   AVES       Average of ESE and ESC
      33-36   STES       Study Effect Size
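
The relationship among RATIO, ESE, ESC and AVES can be sketched as below. This is an illustration only: it assumes that each effect size is the treatment mean minus the control mean, divided by the named group's standard deviation. The function name `effect_sizes` and the sample scores are hypothetical, and the report's own computational procedure may have differed in detail.

```python
def effect_sizes(mean_t: float, mean_c: float, sd_t: float, sd_c: float) -> dict:
    """Compute the Card 10 effect-size variables from group statistics.

    Assumption: difference is taken as treatment minus control, so a
    positive value favors the treatment group.
    """
    diff = mean_t - mean_c
    ese = diff / sd_t          # standardized on the treatment group SD
    esc = diff / sd_c          # standardized on the control group SD
    return {
        "RATIO": sd_t / sd_c,  # treatment SD relative to control SD
        "ESE": ese,
        "ESC": esc,
        "AVES": (ese + esc) / 2,
    }


# Hypothetical posttest results for one comparison.
example = effect_sizes(mean_t=52.0, mean_c=50.0, sd_t=10.0, sd_c=8.0)
```

When the two groups have unequal standard deviations, as in this example, ESE and ESC differ, which is presumably why both were recorded along with their average.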



[Facsimile of the coding sheet: "SCIENCE PROJECT CODING SHEET - Ques. 2,
Systems, STANFORD," with blanks for Author, Title and Measure Used, and
entry rows for Cards 1 through 10.]



Variables Generated Prior to the Analysis 

Prior to the initiation of the analysis itself, several new variables 
were created from existing variables. These newly created variables are 
listed below. 

IMMEDES1  A variable to indicate whether the experimental group was
          evaluated within four weeks of the conclusion of the
          intervention or after that time.
            1. Immediate evaluation.
            2. Delayed evaluation.

IMMEDES2  Contains similar information as IMMEDES1 but pertaining to
          the control group.

ALLTIME1  A variable containing the total length of time in minutes of
          the experimental group program (i.e., the duration of the
          experimental group intervention).

ALLTIME2  Contains similar information as ALLTIME1 but pertaining to
          the control group.

VALDESN1  A variable to indicate the manner in which subjects were
          allocated to the experimental group.
            1. Allocation by stratified random or random sampling.
            2. Allocation by matching subjects or by randomly allocating
               intact groups.
            3. Nonrandom allocation.

VALDESN2  Contains similar information as VALDESN1 but pertaining to
          the control group.
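
The derivations of these variables are not spelled out in the report; the sketch below shows one plausible reading. It assumes that IMMEDES1 was derived from WEEKS1 (weeks elapsed prior to testing), that ALLTIME1 is the product of the program duration in weeks and the minutes per week of treatment, and that VALDESN1 collapses the six student-assignment codes (SASS1), with self-selection counted as nonrandom. The function names are hypothetical.

```python
MISSING = -9  # blank columns on the coding sheet


def immedes(weeks_elapsed: int) -> int:
    """1 = evaluated within four weeks of the intervention, 2 = later."""
    if weeks_elapsed == MISSING:
        return MISSING
    return 1 if weeks_elapsed <= 4 else 2


def alltime(duration_weeks: int, minutes_per_week: int) -> int:
    """Total program length in minutes (assumed: weeks x minutes per week)."""
    if MISSING in (duration_weeks, minutes_per_week):
        return MISSING
    return duration_weeks * minutes_per_week


# Assumed collapse of SASS1: 1-2 random, 3-4 matched or intact random,
# 5-6 nonrandom (self-selected treated as nonrandom).
_VALDESN_FROM_SASS = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3}


def valdesn(sass: int) -> int:
    """Collapse the six assignment codes into three design categories."""
    return _VALDESN_FROM_SASS.get(sass, MISSING)
```

For instance, a six-week program taught 250 minutes per week would give `alltime(6, 250) == 1500` under these assumptions.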






Variables Recoded Prior to the Analysis

In several cases existing values of certain variables were modified and 
regrouped prior to the analysis. The new value labels are listed below. 

METHMS    Method of measurement (type of instrument)
            1. Published, nationally available, standardized.
            2. Modified national standardized and ad hoc written tests
               (previous values 2 and 3 taken together).
            4. All other types of evaluation (previous values 4, 5, 6
               and 7 taken together).

TYPCRIT   Type of outcome criterion
            1. All cognitive and problem-solving (previous values 1, 2,
               3 and 4 taken together).
            5. All affective (previous values 5, 6, 7 and 8 taken
               together).
           10. Science methods (previous values 10 and 18 taken together).
           (All other value labels remain the same.)

SOURCE    Calculation of effect size
            1. Directly from reported or raw data (same as previous
               value 1).
            2. By direct calculation from reported statistics (previous
               values 2 and 9 taken together).
            3. Less trustworthy methods of effect size estimation
               (previous values 3, 4, 5, 6, 7, 8, 10 and 11 taken
               together).
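
The regroupings listed in this section can be expressed as lookup tables, as in the sketch below. The dictionary and function names are hypothetical, and the code 3 used for the "less trustworthy" SOURCE group is an assumption, since the report does not clearly show the value assigned to that group.

```python
# Old value -> new value; values not listed keep their original code.
METHMS_RECODE = {2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 4}

TYPCRIT_RECODE = {
    1: 1, 2: 1, 3: 1, 4: 1,   # all cognitive and problem solving
    5: 5, 6: 5, 7: 5, 8: 5,   # all affective
    10: 10, 18: 10,           # science methods
}

LESS_TRUSTWORTHY = 3  # assumed code for the pooled "less trustworthy" group
SOURCE_RECODE = {1: 1, 2: 2, 9: 2}
SOURCE_RECODE.update({old: LESS_TRUSTWORTHY for old in (3, 4, 5, 6, 7, 8, 10, 11)})


def recode(value: int, table: dict) -> int:
    """Return the regrouped code; unlisted values (including -9) pass through."""
    return table.get(value, value)
```

Passing unlisted values through unchanged keeps the missing-data code of -9 intact after recoding.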






SECTION II. CODING THE DATA 






CODING THE DATA 

Sampling and Coding 

Coding of studies began with microfilmed dissertations sent 
from Colorado whose titles implied that they would fall into the 
appropriate domain of inquiry. 

Studies available through ERIC were identified directly by 
Colorado. Since dissertations are usually available in either 
microfilm or microfiche, this prior identification allowed the tedious 
task of going through thirty years of Dissertation Abstracts to be 
skipped by the coding center. Copies of abstracts of ERIC-available 
science studies facilitated the identification of studies in the 
system area. Five shipments of microfilms were coded; studies 
available on microfiche were obtained locally. 

Scanning the bibliographies of each study gave a file of 
possible leads, including journal articles, books, dissertations, 
and conference papers. Following up these references frequently 
disclosed that they did not pertain to science instruction; or they 
were descriptive rather than experimental; or they involved college 
students as subjects, and so forth.

Educational journals were scanned, volume by volume, from 1950 (or a
later initial date in some instances) to the present time. Likely-sounding
titles in the tables of contents were followed up. These
articles frequently were not relevant to the investigation; many 
described the same studies that had already been coded in dissertation form. 






The following journals were examined during the above process: 
American Biology Teacher 
American Educational Research Journal
Audiovisual Communication Review 

Bulletin, National Association of Secondary School Principals 

California Journal of Educational Research 

Journal of Chemical Education 

Journal of Computer Based Education 

Journal of Educational Psychology 

Journal of Educational Research 

Journal of Experimental Education 

Journal of Programmed Instruction 

Journal of Research in Science Teaching

Harvard Educational Review 

School Science and Mathematics 

Science Education 

Science Teacher 

Dissertations were the source of 58.5% of the included studies; journal 
articles, 31.5%; and unpublished studies, 10%. In many cases, it should be
noted that a given study may have been reported several times (as a disser- 
tation, one or more journal articles and a conference paper). When this 
occurred the most complete reported version of the study was used as the 
basis of the coding performed here. 






Sampling Restrictions 

Various restrictions and conventions were adopted to limit the range of 
the sample of studies coded in the meta-analysis.

1. Age of Subjects. The meta-analysis was limited to studies using
students in grades K through 12. As in many other areas of educational
research, studies in science education are often conducted on college
students, who are most accessible to researchers. The college setting also
provides some features that are not commonly found in elementary or
secondary schools, such as computers, teaching assistants, and open
laboratories. The data here then include very little on computer-managed or
computer-assisted instruction, computer simulated experiments, audio-tutorial
systems, or the Keller Personalized System of Instruction (designed for use
at the college level).

2. Geography. The investigations carried out here were limited to
studies reported in the United States. Doctoral dissertations were a major
source of information, and American dissertations were the only ones readily
available. Studies published in other countries, if included, would have
been limited to those written in English and accessible in journal or book
form, thus producing an incomplete international picture.

3. Control Group Instruction. Only studies which used a control group
taught in the "traditional" or "conventional" classroom manner were included. 
This restriction eliminated studies (particularly where the dependent vari- 
able was student attitude toward science) which included some form of science 
instruction for the treatment group and no science instruction at all for 
the control group. 

4. Year of Publication of Study. The year 1950 was designated as the
earliest date of publication for included studies. It was expected that
the bulk of science studies would have been conducted from the late 1950s
through the mid-1970s, the period of generous governmental funding of the
sciences. Examining the dates of the coded studies confirms the validity
of this expectation.

Of the more than 300 studies purporting to investigate "systems" which
were considered and rejected, the following reasons for rejection were
documented:

42% - subjects were college-aged

33% - incomplete data, such as means but no other information, only
interview data or no data, or levels of significance with no
indication of direction

17% - no control group

6% - control groups which were not taught "traditionally" (e.g.,
comparing two levels of individualized instruction)

2% - subjects were teachers rather than students.







SECTION III. ANALYZING THE DATA 



ANALYZING THE DATA 



All Effect Sizes Over All Studies 

Overall, a total of 341 effect sizes were generated in the Teaching
Systems area of the current meta-analysis. The mean effect size produced
over all systems was 0.103 with a standard deviation of 0.414, indicating
that, on the average, an innovative teaching system in this sample can be
expected to be only one-tenth of a standard deviation better than traditional
science teaching. Below, this mean effect size over all systems will be
considered and discussed as a function of selected variables thought to be
of interest.



Table 1

Mean Effect Size by Year of Publication

             No. of          Mean        Standard
Year      Effect Sizes    Effect Size    Deviation

1950            2            0.250         0.014
1951            2            0.870         0.495
1952            1            1.050         0.000
1956            2            0.035         0.050
1957            2            0.025         0.050
1959            8           -0.194         0.334
1960            7            0.069         0.161
1961           29            0.015         0.464
1962           14           -0.062         0.377
1963           14            0.054         0.495
1964            9            0.207         0.248
1965           18            0.111         0.221
1966           19            0.036         0.259
1967            5           -0.176         0.286
1968           19            0.058         0.378
1969           21            0.097         0.251
1970           27            0.081         0.455
1971           54            0.190         0.456
1972           23            0.071         0.493
1973           19            0.007         0.330
1974           17           -0.015         0.463
1975            6            0.482         0.282
1976            4            0.443         0.245
1977            9            0.631         0.526
1978            6            0.098         0.233
1979            2            0.430         0.325
1980            2            0.000         0.000



In all, the 130 studies coded gave rise to 341 effect sizes distributed
over the years 1950 through 1980, with the bulk of the effect sizes being
obtained in the years 1961 through 1974. The minimum mean effect size for 
any given year was -0.194 with a standard deviation of 0.334, occurring in 
1959 (based on 8 effect sizes), and the maximum mean effect size was obtained 
in 1951 with a value of 0.870 and a standard deviation of 0.495 (based on 2 
effect sizes). No overall trend is evident from the data. 



Table 2

Mean Effect Size by Form of Publication

                        No. of          Mean        Standard
Form                 Effect Sizes    Effect Size    Deviation

Journal article           96            0.201         0.480
Dissertation             214            0.064         0.377
Unpublished paper         25           -0.034         0.360
Conference paper           6            0.508         0.172
ALL                      341            0.103         0.414



Studies were reported as journal articles (producing 96 effect sizes),
dissertations (producing 214 effect sizes), unpublished papers (producing
25 effect sizes), or conference papers (producing 6 effect sizes). The
mean effect size over all systems derived from studies reported in journals
was 0.201 with a standard deviation of 0.480, and the mean effect size
derived from studies reported in dissertations was 0.064 with a standard
deviation of 0.377, illustrating the selection bias noted earlier by Glass
et al. (1981, Chapter 7). The mean effect size derived from studies reported
as unpublished papers was -0.034 with a standard deviation of 0.360, and the
mean effect size derived from studies reported at conferences was
0.508 with a standard deviation of 0.172.
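
As a consistency check on Table 2, the overall mean effect size can be recovered as the average of the subgroup means weighted by their numbers of effect sizes. The sketch below uses the Table 2 figures; the variable names are illustrative only.

```python
# (form of publication, number of effect sizes, mean effect size) from Table 2
table2 = [
    ("Journal article", 96, 0.201),
    ("Dissertation", 214, 0.064),
    ("Unpublished paper", 25, -0.034),
    ("Conference paper", 6, 0.508),
]

total_n = sum(n for _, n, _ in table2)

# Weighted mean: each subgroup mean counts in proportion to its effect sizes.
pooled_mean = sum(n * mean for _, n, mean in table2) / total_n
```

The weighted mean agrees, to three decimal places, with the overall value of 0.103 reported in the ALL row, and the subgroup counts sum to 341.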



Table 3

Mean Effect Size by Grade

             No. of          Mean        Standard
Grade     Effect Sizes    Effect Size    Deviation

1               5            0.524         0.289
2               3           -0.253         0.280
3               7            0.050         0.479
4              10           -0.024         0.151
5              28            0.121         0.258
6              19           -0.074         0.435
7              28            0.086         0.293
8              25            0.315         0.491
9              31            0.115         0.263
10             63            0.099         0.406
11             76            0.152         0.420
12             43            0.008         0.548
missing         3
ALL           341            0.103         0.414



In the current meta-analysis, effect sizes were obtained for studies
which drew their subjects from grades 1 through 12. The minimum mean
effect size obtained was -0.253 with a standard deviation of 0.280 (based
on three effect sizes) obtained in grade 2, and the maximum mean effect size
was 0.524 with a standard deviation of 0.289 (based on 5 effect sizes)
obtained in grade 1. No obvious relationship between magnitude of mean
effect size and grade is readily apparent in the data.

It should be noted here that, due to the constraints of educational
practice, such a breakdown as is being attempted here on the basis of grade
tends to subdivide the effect sizes obtained into subgroups differing also
by curriculum area (i.e., students in grade 10 tend to study biology,
students in grade 11 tend to study chemistry, etc.).



Table 4

Mean Effect Size by Assignment to Groups

                           No. of          Mean        Standard
Assignment to Groups    Effect Sizes    Effect Size    Deviation

Stratified random            38            0.010         0.390
Random                       79            0.150         0.477
Matched                      41            0.088         0.339
Intact random                91            0.206         0.428
Intact nonrandom             86           -0.003         0.362
Self-selected                 6            0.142         0.215
ALL                         341            0.103         0.414



In the current meta-analysis, an attempt was made to attach to each 
effect size a variable whose value described the method which was used to 
allocate subjects to either the experimental or control group. This vari- 
able was categorical in nature, and in all had six values, one of which was 
allocated to each effect size. In Table 4 are reported the mean effect 
sizes generated when the total mean effect size is broken down by the six 
values of this variable. 
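The breakdown described above amounts to grouping the effect sizes by the
value of a categorical variable and reporting the count, mean, and standard
deviation within each group. The following is a minimal sketch of that
procedure, not the authors' own program. The function name and the sample
records are hypothetical, and two conventions are assumed: the sample
(n - 1) standard deviation, and a standard deviation of 0.000 for a group
containing a single effect size, as printed in the summary tables of this
report.

```python
from collections import defaultdict
from statistics import mean, stdev


def breakdown(records):
    """Group (category, effect_size) pairs and summarize each group.

    Returns {category: (count, mean, standard deviation)}. A group with a
    single effect size is given a standard deviation of 0.0 (assumed
    convention, matching the single-entry rows in the report's tables).
    """
    groups = defaultdict(list)
    for category, delta in records:
        groups[category].append(delta)

    summary = {}
    for category, deltas in groups.items():
        sd = stdev(deltas) if len(deltas) > 1 else 0.0
        summary[category] = (len(deltas), mean(deltas), sd)
    return summary


# Hypothetical effect sizes, for illustration only (not data from the report).
records = [
    ("Random", 0.30),
    ("Random", 0.10),
    ("Matched", 0.05),
    ("Matched", 0.15),
    ("Self-selected", 0.20),
]
summary = breakdown(records)
for category, (count, avg, sd) in summary.items():
    print(f"{category:15s} {count:3d} {avg:7.3f} {sd:7.3f}")
```

The same routine applies to any of the categorical variables used in
Tables 4 through 9; only the category attached to each effect size changes.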






Table 5

Mean Effect Size by Subject Matter

Subject Matter           No. of Δ    Mean Δ    Standard Deviation

General Science             100       0.090         0.315
Life Science                 12       0.155         0.201
Physical Science             16       0.134         0.286
Biology                      76       0.150         0.483
Earth Science                 7       0.084         0.216
Chemistry                    73       0.146         0.441
Physics                      54      -0.014         0.508
Other                         3       0.093         0.330
ALL                         341       0.103         0.414



Table 6

Mean Effect Size by Type of Outcome Criterion

Type of Outcome Criterion    No. of Δ    Mean Δ         Standard Deviation

Cognitive: low                   61       0.050            0.461
Cognitive: high                  11       0.[?]94          0.394
General achievement             165       [illegible]      [illegible]
Problem solving                  12       [illegible]      [illegible]
Affective toward subject         13       [illegible]      0.236
Affective toward science         22       [?].315          0.333
Affective toward method           6       0.217            0.404
Affective toward studying         4       0.030            0.251
Process skills                    3      -0.107            0.199
Methods of science               12       0.350            0.475
Psychomotor (lab skills)          6       0.892            0.684
Critical thinking                 7       0.234            0.311
Creativity                        4       0.430            0.457
Decision making                   2       0.080            0.014
Logical thinking                  3       0.403            0.280
Self-concept                      3       0.317            0.100
Science perceptions               7       0.211            0.298
ALL                             341       0.103            0.414






Table 7

Mean Effect Size by Origin of Instrument Used

Method of Measurement
(Type of Instrument)          No. of Δ    Mean Δ    Standard Deviation

Published nationally;
  standardized                   173       0.045         0.387
Modification of
  national standardized           27       0.187         0.365
Ad hoc written tests             131       0.113         0.398
Classroom evaluation               6       1.028         0.511
Structured interview,
  assessment                       2       0.720         0.453
Missing                            2
ALL                              341       0.103         0.414



Table 8

Mean Effect Size by Method Used to Calculate Effect Size

Calculation of Effect Size            No. of Δ    Mean Δ    Standard Deviation

Directly from reported or raw data       179       0.099         0.435
  (means and variances)
Reported with direct estimates           115       0.150         0.408
  (ANOVA, ANCOVA, t, F)
Directly from frequencies reported         2       0.255         0.375
  on ordinal scales (Probit, χ²)
Backwards from other variances of          4      -0.150         0.145
  means with random assignment
Nonparametrics                             8      -0.030         0.150
  (other than #3)
Guessed from independent                  23       0.011         0.233
  sources
Estimated from variance                    6       0.043         0.528
  (correlation guessing)
Estimated directly from                    2      -0.735         0.163
  p-value
From percentiles                           2       0.210         0.184
ALL                                      341       0.103         0.414






Table 9

Mean Effect Size by the Means Used in the
Effect Size Calculation

Source of Means          No. of Δ    Mean Δ    Standard Deviation

Unadjusted posttest         162       0.125         0.448
Covariance adjusted          67       0.086         0.387
Pre-post differences         93       0.087         0.382
Other                        18       0.024         0.358
Missing                       1
ALL                         341       0.103         0.414






Mean Effect Size, System by System 

Tables 10a and 10b list the mean effect size obtained for each system
and subsystem. In Table 10a the mean effect size on all outcome variables
combined is presented; Table 10b shows the mean effect size for each
outcome variable (e.g., cognitive, affective, science methods,
self-concept, etc.) for each system and subsystem.

Since there is some variation in the way the data are consolidated
within the two tables, a description is needed at this point. In Table 10a,
the row labelled "ALL" includes each of the 341 individual effect sizes
obtained from the studies integrated. The effect sizes found in the remainder
of Table 10a, however, total up to more than 341 and their weighted average is
not that given in the "ALL" row, for two reasons. First, the table contains
rows with data on various subsystems which, of course, duplicate the system
data summarized in the line immediately above each group of subsystems.
Systems for which subsystem information is given are computer-linked,
media-based, and programmed instruction. Second, as noted previously, some
effect sizes have been listed in more than one system in cases where the
system evaluated in a given study met the definition presented earlier for
more than one system. This duplicate listing occurred in 93 instances;
essentially all of them are the result of an effect size being listed in both
individualized instruction and one of several other systems.

In Table 10b, the "All Systems" row at the bottom of the table is the
weighted average of the mean effect sizes for each of the above systems
(information has not been duplicated, however, by inclusion of subsystem
information in this weighted average). This table shows the mean effect size
for each outcome variable for each system as well as an overall mean effect
size on each outcome variable for all systems combined.
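The weighted average referred to here can be illustrated with a short
sketch. It assumes that each mean effect size is weighted by the number of
effect sizes on which it is based; the function name is hypothetical. The
figures used are the subsystem means reported later in this report for
computer assisted instruction (0.010, N = 5), computer managed instruction
(0.048, N = 8), and computer simulated experiments (1.450, N = 1), which
together should reproduce the computer-linked system mean of 0.134 to
within rounding.

```python
def weighted_mean(pairs):
    """Average of group means, each weighted by its number of effect sizes.

    pairs: iterable of (mean_effect_size, number_of_effect_sizes).
    """
    pairs = list(pairs)
    total_n = sum(n for _, n in pairs)
    if total_n == 0:
        raise ValueError("at least one effect size is required")
    return sum(m * n for m, n in pairs) / total_n


# Subsystem means and counts as reported for the computer-linked systems.
subsystems = [
    (0.010, 5),  # computer assisted instruction
    (0.048, 8),  # computer managed instruction
    (1.450, 1),  # computer simulated experiments
]
combined = weighted_mean(subsystems)
print(f"{combined:.3f}")  # compare with the reported system mean of 0.134
```

Because the subsystem means are themselves rounded to three decimals, the
recomputed value can differ from the printed system mean in the last digit.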



Table 10a

MEAN EFFECT SIZE, SYSTEM BY SYSTEM, ON ALL OUTCOME VARIABLES COMBINED

System                      Mean Δ   No. of Δ   s.d.          Max. Δ        Min. Δ

ALL                          0.10      341      0.41           1.74         -0.87
Audio-Tutorial               0.17        7      0.27           0.52         -0.27
Computer Linked              0.13       14      0.58           1.45         -0.58
  CAI                        0.01        5      0.74           1.23         -0.58
  CMI                        0.05        8      0.22           0.53         -0.19
  CSE                        1.45        1      0              1.45          1.45
Contracts for Learning       0.47       12      0.61           1.74         -0.38
Dept. Elem. School          -0.09        3      0.17           0.08         -0.25
Individualized Instr.        0.17      131      [illegible]    [illegible]   [illegible]
Mastery Learning             0.64       13      0.43           1.74          0.08
Media Based Instr.          -0.02      100      0.37           1.22         -0.87
  TV                         0.05       40      0.35           0.77         -0.87
  Film                      -0.07       58      0.38           1.22         -0.74
  Slides                    -0.47        1      0             -0.47         -0.47
  Tapes                     -0.27        1      0             -0.27         -0.27
PSI (Pers. Syst. Inst.)      0.60       15      0.42           1.74          0.08
Programmed Instr.            0.17       52      0.48           1.36         -0.82
  Branched                   0.21        5      0.80           1.23         -0.42
  Linear                     0.17       47      0.44           1.36         -0.82
Self-Directed                0.08       27      0.38           0.87         -0.58
Source Papers                0.14       13      0.21           0.48         -0.19
Student Assisted             0.09        6      0.17           0.34         -0.13
Team Teaching                0.05       41      0.38           1.36         -0.76



Table 10b

MEAN EFFECT SIZE ON EACH OUTCOME VARIABLE FOR EACH SYSTEM AND SUBSYSTEM

                                                     Science                    Critical
                        Cognitive     Affective      Methods      Psychomotor   Thinking
System                   Δ      n      Δ      n      Δ      n      Δ      n      Δ      n

Audio-Tutorial          .09     5     .33     1
Computer-Linked         .22    11    -.17     3
  CAI                   .16     4    -.58     1
  CMI                   .05     6     .04     2
  CSE                  1.45     1
Contracts for
  Learning              .22     5     .33     3    1.24     2                   .53     2
Dept. Elem. Sch.       -.09     3
Individualized Instr.   .12   102     .16    10     .43     9    1.17     2     .33   [?]
Mastery Learning        .50     8     .52     2    1.24     2                   .89     1
Media Based Instr.     -.03    75    -.10    16     .12     5    -.08     1     .16   [?]
  T.V.                  .02    33    -.12     1     .17     4                   [?]   [?]
  Film                 -.06    40    -.10    15    -.10     1    -.08     1     .17     1
  Slides               -.47   [?]
  Tapes                -.27     1
P.S.I.                  .49     7     .52     2    1.24     2                   .89     1
Programmed Instr.       .17    51     .20     1
  Branched              .21   [?]
  Linear                .17    46     .20     1
Self-Directed          -.12    15    -.10     3    -.11     1                   .17     1
Source Papers           .14     9    -.19     1     .25     3
Student Assisted        .11     2     .17     2                                 .02     1
Team Teaching           .09    31    -.12     7     .18     3
ALL SYSTEMS             .10   325     .04    51     .47    28     .75     3     .39    12

Logical Thinking, Creativity, and Self-Concept (remaining columns):



.40 



.50 2 .50 2 

.77 1 
.77 1 



17 1 .40 3 .50 2 



.04 1 



A2 



.37 



.42 1 



19, 






Group Sizes, System by System 

For each effect size in the current meta-analysis, the final sizes
of the treatment and control groups were recorded.



Table 11

Final Size of Treatment Group Within Each System

                               No. of Δs    Mean n         Maximum        Minimum

All Δs                            341        122.9          >999            14
Audio-Tutorial                      7         39.7            57            15
Computer Linked                    14        111.6           232            24
  CAI                               5         38.2         [illegible]      24
  CMI                               8        167.9           232            24
  CSE                               1         29              29            29
Contracts for Learning             12         31.9         [illegible]      20
Departmentalized El. Sch.           3        2[?]4.3       [illegible]    [illegible]
Individualized Instruction        131       [illegible]     3[?]1           14
Mastery Learning                   13         24.5            35            20
Media Based                       100        229.2          >999            15
  Television                       40        242.2          >999            70
  Film                             58        227.9           919            22
Personalized System of Inst.       15         30.8            52            20
Programmed Instruction             52         73.7           186            18
  Branched                          5         39.0            58            26
  Linear                           47         77.7           186            18
Self-directed Study                27         51.3           122            23
Source Papers                      13         35.7            50            25
Student Assisted                    6         62.7            68            48
Team Teaching                      41        100.3           261            25



Table 12

Final Size of Control Group Within Each System

                               No. of Δs    Mean n         Maximum        Minimum

All                               341        122.9           900            15
Audio-Tutorial                      7         40.7            56            15
Computer Linked                    14         87.9           233            20
  CAI                               5         34.2            52            20
  CMI                               8        127.5           233            23
  CSE                               1         39              39            39
Contracts for Learning             12         28.9            49            20
Dept. Elem. Sch.                    3        356.7           707           175
Individualized Instr.             131         65.1           499            15
Mastery Learning                   13         19.7            23            18
Media Based                       100        181.1           900            17
  Television                       40        145.2           520            70
  Film                             58        212.4           900            17
PSI                                15         26.7            51            18
Programmed Instr.                  52       [illegible]      176            18
  Branched                          5         36.6            52            26
  Linear                           47         72.2           176            18
Self-directed Study                27         48.0            98            20
Source Papers                      13         35.7            50            25
Student Assisted                    6         48.3            64            25
Team Teaching                      41        103.2           338            25






Summary Data for the Individual Systems 

On the following pages, summary data are given and discussed for each 
of the teaching systems taken separately. The systems are arranged in 
alphabetical order, and in each case a summary data table is included. 

Audio-Tutorial System. Seven effect sizes were obtained for the
Audio-Tutorial system, with a mean effect size of 0.170 and a standard
deviation of 0.274. The effect sizes were obtained from studies performed
in the years 1970, 1972, 1974, and 1976, and although the mean effect
sizes generated in each of these years vary considerably, definitive
statements concerning their relative magnitude are difficult to make due to
the small number of effect sizes concerned. However, the maximum mean effect
size of 0.335 (based on 2 effect sizes) was obtained in 1976 and the
minimum effect size of 0.000 (based on 1 effect size) was obtained in 1970.

When the effect sizes are broken down by form of reporting, the mean
effect size obtained from journal articles is 0.223 with a standard
deviation of 0.268 (based on 3 effect sizes) and the mean effect size from
dissertations is 0.130 with a standard deviation of 0.312 (based on 4 effect
sizes). Effect sizes were obtained in grades 3, 4, 9, and 10, the minimum
being obtained in grade 3 with a mean effect size of 0.000 (based on 1
effect size), and the maximum in grade 10 with a mean effect size of 0.335
(based on 2 effect sizes). All effect sizes in the audio-tutorial system
sample were produced by studies which used randomized allocation of subjects
to groups, and hence no statements can be made concerning the variation of
mean effect sizes over the variable VALDESN (a variable to measure the
validity of the experimental design).
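The subgroup figures quoted above can be checked against the overall
figures for the system. The sketch below recombines subgroup counts, means,
and standard deviations into an overall mean and standard deviation. It
assumes that the reported standard deviations are sample (n - 1) values,
which is an inference from the arithmetic rather than something stated in
the report; the function name is hypothetical. Applied to the journal
article and dissertation subgroups, it reproduces the overall audio-tutorial
values of 0.170 and 0.274.

```python
import math


def combine_groups(groups):
    """Recombine subgroup summaries into an overall mean and s.d.

    groups: list of (n, mean, sd) tuples, where sd is assumed to be the
    sample standard deviation (divisor n - 1) within each subgroup.
    """
    total_n = sum(n for n, _, _ in groups)
    if total_n < 2:
        raise ValueError("at least two effect sizes are required")
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Sum of squares within subgroups plus sum of squares between them.
    within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    grand_sd = math.sqrt((within + between) / (total_n - 1))
    return total_n, grand_mean, grand_sd


# Audio-tutorial system, broken down by form of reporting.
by_form = [
    (3, 0.223, 0.268),  # journal articles
    (4, 0.130, 0.312),  # dissertations
]
n, overall_mean, overall_sd = combine_groups(by_form)
print(f"N = {n}, mean = {overall_mean:.3f}, s.d. = {overall_sd:.3f}")
# prints "N = 7, mean = 0.170, s.d. = 0.274"
```

The same check can be applied to any other breakdown in the summary tables,
provided every effect size falls into exactly one subgroup.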

Systems using the audio-tutorial technique were evaluated in two
curriculum areas: general science and biology. The mean effect size obtained
in biology was greater than that obtained in general science, with biology
having a mean effect size of 0.230 (based on 5 effect sizes) and general
science having a mean effect size of 0.020 (based on 2 effect sizes).

Effect sizes were obtained on various types of outcome criteria in
studies that evaluated the audio-tutorial system. Effect sizes based on
cognitive outcome criteria registered a mean effect size of 0.088 and a
standard deviation of 0.287 (based on 5 effect sizes) and effect sizes
based on affective outcome criteria registered a mean effect size of 0.330
(based on 1 effect size). The tests that were used to evaluate the effect
size of the audio-tutorial system were published tests, modified published
tests, and ad hoc tests produced by the investigator. Effect sizes
generated by studies which made use of published test materials produced
a mean effect size of 0.375 with a standard deviation of 0.064 (based on
2 effect sizes) whereas effect sizes produced by studies which made use of
modified published tests and ad hoc tests taken together had a mean effect
size of 0.088 and a standard deviation of 0.287 (based on 5 effect sizes).

Studies which reported the raw data in the account of the investigation 
were able to generate effect sizes directly from the raw data and such 
studies produced a mean effect size of 0.130, with a standard deviation of 
0.312 (based on 4 effect sizes). Other studies reported the results of
their investigation as a statistic or group of statistics (t, F, etc.) and
mean effect sizes were calculated by methods due to Glass et al. (1981),
producing a mean effect size of 0.335 with a standard deviation of 0.262 
(based on 2 effect sizes). 
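The two routes to an effect size mentioned above, directly from reported
means and standard deviations or indirectly from a reported test statistic,
can be sketched as follows. The formulas shown are the forms commonly given
in the meta-analysis literature and are offered only as an illustration:
this section of the report does not state which formulas were applied,
so the standardization by the control group standard deviation, the
conversion from an independent-groups t statistic, and the function names
are all assumptions.

```python
import math


def glass_delta(mean_exp, mean_ctrl, sd_ctrl):
    """Effect size from raw summary data (assumed form).

    Difference between experimental and control group means, expressed in
    units of the control group standard deviation.
    """
    if sd_ctrl <= 0:
        raise ValueError("control group s.d. must be positive")
    return (mean_exp - mean_ctrl) / sd_ctrl


def delta_from_t(t, n_exp, n_ctrl):
    """Effect size recovered from an independent-groups t statistic.

    Assumed textbook conversion: delta = t * sqrt(1/n_exp + 1/n_ctrl),
    which standardizes by the pooled within-group s.d. implicit in t.
    """
    if n_exp <= 0 or n_ctrl <= 0:
        raise ValueError("group sizes must be positive")
    return t * math.sqrt(1.0 / n_exp + 1.0 / n_ctrl)


# Hypothetical figures, for illustration only (not data from the report).
from_raw = glass_delta(mean_exp=55.0, mean_ctrl=50.0, sd_ctrl=10.0)
from_t = delta_from_t(t=2.0, n_exp=20, n_ctrl=20)
print(f"from raw data: {from_raw:.3f}")
print(f"from t:        {from_t:.3f}")
```

A conversion from a reported statistic relies on the design assumptions
behind that statistic, which is one reason to tabulate effect sizes
separately by method of calculation, as is done in Table 8.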



Table 13
AUDIO-TUTORIAL SYSTEM

Mean Δ = 0.170
s.d. = 0.274
N = 7

                                      Mean Δ    Standard Deviation    N

By Year of Publication
  1970                                 0.000         0.000            1
  1972                                 0.040         0.000            1
  1974                                 0.160         0.375            3
  1976                                 0.335         0.262            2

By Form of Reporting
  Journal article                      0.223         0.268            3
  Dissertation                         0.130         0.312            4

By Grade Level of Subjects
  3                                    0.000         0.000            1
  4                                    0.040         0.000           [?]
  9                                    0.160         0.375           [?]
  10                                   0.335         0.262            2

By Validity of Design
  Random                               0.170         0.274            7

By Subject Matter
  General Science                      0.020         0.028            2
  Biology                              0.230         0.311            5

By Immediate or Retention
  Immediate                            0.223         0.268            3
  Missing information                                                 4

By Type of Outcome Criterion
  Cognitive                            0.088         0.287            5
  Affective                            0.330         0.000            1
  Self-Concept                         0.420         0.000            1

By Method of Measurement (Instrument)
  Published                            0.375         0.064            2
  Modified published & Ad hoc          0.088         0.287            5

By Calculation of Effect Size
  From raw data                        0.130         0.312            4
  By direct calc.                      0.335         0.262            2
  Less trustworthy                     0.000         0.000            1

By Source of Means
  Unadjusted posttest                  0.040         0.000           [?]
  Pre-post differences                 0.230         0.311           [?]
  Other                                0.000         0.000           [?]






Computer-linked Systems. Studies addressing the efficacy of
computer-linked systems generated 14 effect sizes with a mean of 0.134 and
a standard deviation of 0.583. The studies which gave rise to these effect
sizes were performed in the years 1965 (4 effect sizes), 1971 (5 effect
sizes), 1972 (4 effect sizes), and 1975 (1 effect size); studies performed
in and before 1971 yielded negative mean effect sizes and studies performed
in and after 1972 yielded positive mean effect sizes. The maximum mean
effect size was produced in 1972, and was 0.575 with a standard deviation of
0.943 (based on 4 effect sizes) and the minimum mean effect size was
produced in 1971 and was -0.148 with a standard deviation of 0.243 (based
on 5 effect sizes).

Studies reported in journals gave rise to a mean effect size of 1.340
with a standard deviation of 0.156 (based on 2 effect sizes) whereas studies
reported in dissertations had a mean effect size of -0.121 with a standard
deviation of 0.247 (based on 11 effect sizes). Studies were performed at
grades 5, 10, 11, and 12, the majority being performed in the senior grades.
It appears as though the mean effect size increases with grade; however,
such trends can be regarded as having no significance due to the small size
of the sample addressed.

Studies performed in order to evaluate the effect size of
computer-linked systems made use of randomized allocation of subjects to
groups, random allocation of intact groups, and nonrandomized allocation;
those studies which made use of randomized allocation generated a mean
effect size of 0.470 with a standard deviation of 1.009 (based on 4 effect
sizes) whereas studies which used nonrandomized allocation produced a mean
effect size of -0.053 with a standard deviation of 0.114 (based on 4 effect
sizes). Studies were performed in the curriculum areas of general science,
chemistry, and physics; the mean effect sizes obtained for chemistry and
physics were similar in magnitude (0.143 and 0.174) and substantially larger
than that obtained in general science (0.020).

Of the 14 effect sizes obtained from studies evaluating this system, 
11 were obtained by immediate assessment of students at the conclusion of 
the experimental and control group interventions and the remaining 3 effect 
sizes were intended to evaluate the retention effects of the interventions. 
The mean effect size for immediate assessment was 0.221 with a standard 
deviation of 0.626 (based on 11 effect sizes) and the mean effect size for 
evaluating retention effects was -0.183 with a standard deviation of 0.240 
(based on 3 effect sizes), indicating that positive effects generated by 
the computer-linked systems in students involved in the experimental
intervention decayed with time relative to students in the control group.

Effect sizes based on cognitive outcome criteria produced a mean effect
size of 0.216 with a standard deviation of 0.618 (based on 11 effect sizes)
and effect sizes based on affective outcome criteria produced a mean effect
size of -0.167 with a standard deviation of 0.359 (based on 3 effect sizes).
Studies which made use of published test materials generated a mean effect
size of -0.158 with a standard deviation of 0.256 (based on 5 effect sizes),
and studies which made use of modified published test materials and ad hoc
test materials taken together generated a mean effect size of 0.297 with a
standard deviation of 0.661 (based on 9 effect sizes), showing that it was
more likely for investigators who authored their own evaluation instruments
to register a larger effect size.

In those cases in which effect sizes could be calculated from
raw data reported in the studies themselves, a mean effect size of 0.149
with a standard deviation of 0.640 (based on 11 effect sizes) was produced,
whereas effect sizes calculated from reported statistics produced a mean
effect size of -0.145 with a standard deviation of 0.064 (based on 2 effect
sizes).



Table 14
COMPUTER-LINKED SYSTEMS

Mean Δ = 0.134
s.d. = 0.583
N = 14

                                      Mean Δ    Standard Deviation    N

By Year of Publication
  1965                                -0.053         0.114            4
  1971                                -0.148         0.243            5
  1972                                 0.575         0.943            4
  1975                                 0.530         0.000            1

By Form of Reporting
  Journal                              1.340         0.156            2
  Dissertation                        -0.121         0.247           11
  Conference Paper                     0.530         0.000            1

By Grade Level of Subjects
  5                                    0.020         0.108            3
  10                                  -0.053         0.114            4
  11                                   0.143         0.941            3
  12                                   0.400         0.841            4

By Validity of Design
  Random                               0.470         1.009            4
  Matched & Intact Random              0.035         0.367            6
  Nonrandom                           -0.053         0.114            4

By Subject Matter
  General Science                      0.020         0.108            3
  Chemistry                            0.143         0.941            3
  Physics                              0.174         0.606            8

By Immediate or Retention
  Immediate                            0.221         0.626           11
  Retention                           -0.183         0.240            3

By Type of Outcome Criterion
  Cognitive                            0.216         0.618           11
  Affective                           -0.167         0.359            3

By Method of Measurement (Instrument)
  Published                           -0.158         0.256            5
  Modified Published & Ad hoc          0.297         0.661            9

By Calculation of Effect Size
  From raw data                        0.149         0.640           11
  By direct calculatn.                -0.145         0.064            2
  Less trustworthy                     0.530         0.000            1

By Source of Means
  Unadjusted posttest                 -0.174         0.269            8
  Pre-posttest differences             0.548         0.731            5
  Other                                0.530         0.000            1






Computer Assisted Instruction. Five effect sizes with a mean of 0.010
and a standard deviation of 0.743 were obtained from studies which evaluated
computer assisted instructional systems. These studies were performed in
the years 1971 and 1972. The mean effect size for studies in the year 1971
was -0.400 with a standard deviation of 0.028 (based on 2 effect sizes) and
the mean effect size produced by studies performed in the year 1972 was
0.283 with a standard deviation of 0.908 (based on 3 effect sizes). Effect
sizes for the CAI system were reported both in journals (with a mean of
1.230) and dissertations (with a mean of -0.295). Students in grades 11
and 12 were used as subjects for CAI evaluation, and a mean effect size of
0.143 with a standard deviation of 0.941 (based on 3 effect sizes) was
obtained for grade 11 subjects and a mean effect size of -0.190 with a
standard deviation of 0.552 (based on 2 effect sizes) was obtained for
grade 12.

Studies which made use of the randomized allocation of subjects to 
groups produced a larger mean effect size of 0.143 than studies which made 
use of matched subjects or the random allocation of intact groups (with 
a mean effect size of -0.190). 

The two curriculum areas which were addressed in these studies were 
chemistry (with a mean effect size of 0.143) and physics (with a mean 
effect size of -0.190). Of the 5 effect sizes, four were generated by 
immediate evaluation of experimental effects, giving rise to a mean effect 
size of 0.118 with a standard deviation of 0.812, and one was generated by 
delayed evaluation (a retention effect) giving rise to an effect size of 
-0.420. Both cognitive and affective outcome criteria were evaluated,
with cognitive measures producing a mean effect size of 0.158 with a
standard deviation of 0.769 (based on 4 effect sizes), and affective measures
producing a mean effect size of -0.580 (based on 1 effect size).

The mean effect size produced by studies which made use of published 
test materials was -0.580 and the mean effect size produced by studies 
which made use of modified published and ad hoc materials taken together
was 0.158 with a standard deviation of 0.769 (based on 4 effect sizes). 



Table 15
COMPUTER ASSISTED INSTRUCTION

Mean Δ = 0.010
s.d. = 0.743
N = 5

                                      Mean Δ    Standard Deviation    N

By Year of Publication
  1971                                -0.400         0.028            2
  1972                                 0.283         0.908            3

By Form of Reporting
  Journal article                      1.230         0.000            1
  Dissertation                        -0.295         0.341            4

By Grade Level of Subjects
  11                                   0.143         0.941            3
  12                                  -0.190         0.552            2

By Validity of Design
  Random                               0.143         0.941            3
  Matched & Intact Random             -0.190         0.552            2

By Subject Matter
  Chemistry                            0.143         0.941            3
  Physics                             -0.190         0.552            2

By Immediate or Retention
  Immediate                            0.118         0.812            4
  Retention                           -0.420         0.000            1

By Type of Outcome Criterion
  Cognitive                            0.158         0.769            4
  Affective                           -0.580         0.000            1

By Method of Measurement (Instrument)
  Published                           -0.580         0.000            1
  Modified published & Ad hoc          0.158         0.769            4

By Calculation of Effect Size
  From raw data                        0.010         0.743            5

By Source of Means
  Unadjusted posttest                 -0.295         0.341            4
  Pre-post differences                 1.230         0.000            1






Computer Managed Instruction. Teaching systems based on the use of
computer managed instruction had a mean effect size of 0.048 with a standard
deviation of 0.220 (based on 8 effect sizes). These studies were performed
in the years 1965 (4 effect sizes), 1971 (3 effect sizes), and 1975 (1 effect
size). The maximum effect size was obtained in 1975 and was 0.530, and the
minimum mean effect size was obtained in 1965 and was -0.053. The mean
effect size for studies reported as dissertations was -0.021 with a standard
deviation of 0.109 (based on 7 effect sizes) and the mean effect size for
studies reported at conferences was 0.530 (based on 1 effect size).

Grades 5, 10, and 12 were used to provide subjects for studies
investigating computer managed instruction, the largest mean effect size being
produced in grade 12, the minimum in grade 10; however, the small size of 
the sample of effect sizes being discussed here prevents definitive state- 
ments being made concerning any trend across grade level. 

None of the studies being addressed here made use of the randomized 
allocation of subjects to groups; however, studies which made use of 
matched allocation and the random allocation of intact groups (taken
together) produced a mean effect size of 0.148 with a standard deviation 
of 0.270 (based on 4 effect sizes) and studies which made use of nonran- 
domized allocation produced a mean effect size of -0.053 with a standard 
deviation of 0.114 (based on 4 effect sizes). The curriculum areas of 
general science and physics were the basis of the evaluation of the computer 
managed instruction system in the sample of studies being meta-analyzed 
here. In both cases, the mean effect sizes were positive but close to zero. 

Of the 8 effect sizes in this subsection, 6 address the question of
immediate effects, giving a mean effect size of 0.085, and 2 address the
question of delayed effects, giving a mean effect size of -0.065. Those
effect sizes for which cognitive outcome criteria were employed generated
a mean effect size of 0.050 with a standard deviation of 0.260 (based on
6 effect sizes) and those effect sizes for which affective outcome criteria
were employed generated a mean effect size of 0.040 with a standard
deviation of 0.028 (based on 2 effect sizes). A mean effect size of -0.053
with a standard deviation of 0.114 (based on 4 effect sizes) was obtained
in those cases in which published test materials were used, and a mean
effect size of 0.148 with a standard deviation of 0.270 (based on 4 effect
sizes) was obtained in those cases in which modified published test
materials and ad hoc test materials were employed. Effect sizes deriving
from the meta-analysis of reported raw data produced a mean effect size of
0.028 with a standard deviation of 0.079 (based on 5 effect sizes), whereas
effect sizes based on the transformation of reported statistics produced
a mean effect size of -0.145 with a standard deviation of 0.064 (based
on 2 effect sizes).






Table 16
COMPUTER MANAGED INSTRUCTION

Mean Δ = 0.048
s.d. = 0.220
N = 8

                                      Mean Δ    Standard Deviation    N

By Year of Publication
  1965                                -0.053         0.114            4
  1971                                 0.020         0.108            3
  1975                                 0.530         0.000            1

By Form of Reporting
  Dissertation                        -0.021         0.109            7
  Conference Paper                     0.530         0.000            1

By Grade Level of Subjects
  5                                    0.020         0.108            3
  10                                  -0.053         0.114            4
  12                                   0.530         0.000            1

By Validity of Design
  Matched & Intact Random              0.148         0.270            4
  Nonrandom                           -0.053         0.114            4

By Subject Matter
  General Science                      0.020         0.108            3
  Physics                              0.064         0.279            5

By Immediate or Retention
  Immediate                            0.085         0.234            6
  Retention                           -0.065         0.177            2

By Type of Outcome Criterion
  Cognitive                            0.050         0.260            6
  Affective                            0.040         0.028            2

By Method of Measurement (Instrument)
  Published                           -0.053         0.114            4
  Modified published & Ad hoc          0.148         0.270            4

By Calculation of Effect Size
  From raw data                        0.028         0.079            5
  By direct calc.                     -0.145         0.064            2
  Less trustworthy                     0.530         0.000            1

By Source of Means
  Unadjusted posttest                 -0.053         0.114            4
  Pre-post differences                 0.020         0.108            3
  Other                                0.530         0.000            1






Computer Simulated Experiments. In this meta-analysis, a single
effect size of 1.450 was obtained for studies purporting to evaluate the
use of computer simulated experiments, and as a consequence the breakdown
of this effect size by other variables in the analysis could not be
performed.



Table 17 
COMPUTER SIMULATED EXPERIMENTS 

Mean Δ = 1.450
s.d. = 0.000 
N = 1 

Published 1972 
Journal article 
Grade 12 

Random assignment 
Physics 
Immediate 
Cognitive 

Modified published & ad hoc 
From raw data 
Pre-post differences 




Contracts for Learning. A mean effect size of 0.467 with a standard
deviation of 0.605 (based on 12 effect sizes) was obtained for studies
purporting to evaluate the use of contracts for learning. These studies
were performed in the early 1970s and generated a maximum mean effect size
of 0.857 with a standard deviation of 0.467 (based on 7 effect sizes) in
1971, and a minimum mean effect size of -0.255 with a standard deviation
of 0.177 (based on 2 effect sizes) in 1974. The mean effect size obtained
from studies reported in journals (0.610) was considerably higher than the
mean effect size obtained from studies reported as dissertations (0.040).
The contracts for learning system was evaluated in grades 8, 9, and 11,
the maximum mean effect size being obtained in grade 8 (0.857) and the
minimum mean effect size being obtained in grade 9 (-0.255).

Studies which made use of the randomized allocation of subjects to 
groups produced a mean effect size of 0.857 with a standard deviation of 
0.467 (based on 7 effect sizes) and studies which made use of matched 
allocation and the random allocation of intact groups taken together pro- 
duced a mean effect size of -0.078 with a standard deviation of 0.201 
(based on 5 effect sizes). The curriculum areas of biology and chemistry
were used in the evaluation of this teaching system.
Nine effect sizes with a mean of 0.610 and a standard deviation of 0.639 
were obtained in the curriculum area of biology, and 3 effect sizes with 
a mean of 0.040 and a standard deviation of 0.114 were obtained in chemistry. 

Of the 12 effect sizes seeking to address the effectiveness of the
contracts for learning system, 11 concerned the evaluation of immediate
effects (with a mean effect size of 0.522) and one sought to evaluate delayed
effects (with an effect size of -0.130). Those effect sizes based on
cognitive outcome criteria produced a mean effect size of 0.218 with a standard
deviation of 0.569 (based on 5 effect sizes), and those based on affective
outcome criteria produced a larger mean effect size of 0.330 with a
standard deviation of 0.449 (based on 3 effect sizes). All effect sizes
generated in this subsection originated in the use of published test materials
as the basis of the outcome measures.

Effect sizes calculated on the basis of reported raw data amounted to
five in all, and had a mean of -0.078 with a standard deviation of 0.201,
and those calculated on the basis of the transformation of reported
statistics had a mean of 0.857 with a standard deviation of 0.467 (7 effect sizes
in all).
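
The distinction between effect sizes computed from raw data and those obtained by transforming reported statistics can be illustrated with two textbook formulas. This is a sketch under stated assumptions: the passage does not spell out the formulas the authors applied, so the Glass-style definition (mean difference divided by the control-group standard deviation) and the conversion from an independent-groups t statistic are assumptions, and the study values are invented for illustration.

```python
import math


def effect_size_from_raw(mean_exp, mean_ctrl, sd_ctrl):
    """Glass-style effect size: mean difference in control-group SD units.

    Assumed definition; the report's exact formula is not given here.
    """
    return (mean_exp - mean_ctrl) / sd_ctrl


def effect_size_from_t(t, n_exp, n_ctrl):
    """Standardized mean difference recovered from an independent-groups t.

    This conversion implies a pooled standard deviation, so it need not
    agree exactly with the control-group-SD version above.
    """
    return t * math.sqrt(1.0 / n_exp + 1.0 / n_ctrl)


# Hypothetical study values, for illustration only.
es_raw = effect_size_from_raw(mean_exp=54.0, mean_ctrl=50.0, sd_ctrl=8.0)
es_t = effect_size_from_t(t=2.0, n_exp=25, n_ctrl=25)
print(f"from raw data:    {es_raw:.3f}")
print(f"from t statistic: {es_t:.3f}")
```

Because the two routes rest on different standardizers and different reported quantities, a gap between the "from raw data" and "transformed statistics" categories may reflect the calculation route as well as the studies themselves.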



Table 18
CONTRACTS FOR LEARNING

Mean effect size = 0.467
s.d. = 0.605
N = 12

                                      Mean     Standard            N
                               Effect Size    Deviation

By Year of Publication
  1971                               0.857        0.467            7
  1972                               0.040        0.114            3
  1974                              -0.255        0.177            2

By Form of Reporting
  Journal article                    0.610        0.639            9
  Dissertation                       0.040        0.114            3

By Grade Level of Subjects
  8                                  0.857        0.467            7
  9                                 -0.255        0.177            2
  11                                 0.040        0.114            3

By Validity of Design
  Random                             0.857        0.467            7
  Matched & Intact Random           -0.078        0.201            5

By Subject Matter
  Biology                            0.610        0.639            9
  Chemistry                          0.040        0.114            3

By Immediate or Retention
  Immediate                          0.522        0.603           11
  Retention                         -0.130        0.000            1

By Type of Outcome Criterion
  Cognitive                          0.218        0.569            5
  Affective                          0.330        0.449            3
  Science Methods                    1.235        0.714
  Critical Thinking                  0.530        0.509

By Method of Measurement (Instrument)
  Published                          0.467        0.605           12

By Calculation of Effect Size
  From raw data                     -0.078        0.201            5
  By direct calculation              0.857        0.467            7

By Source of Means
  Unadjusted posttest               -0.078        0.201
  Covariance adjusted                0.857        0.467






Departmentalized Elementary School. The departmentalized elementary
school system (with a mean effect size of -0.090 and a standard deviation 
of 0.165 based on 3 effect sizes) was evaluated in studies reported in the 
years 1963, 1967, and 1969. The maximum effect size of 0.080 was obtained 
in 1963 and the minimum effect size of -0.250 was obtained in 1969. Effect 
sizes calculated from data reported in journal articles generated a mean 
of 0.080 whereas effect sizes calculated from data reported in dissertations 
had a lower mean of -0.175. 

Those studies in which the raw data itself was reported produced a 
mean effect size of 0.080, while those studies which reported their out- 
comes as a statistic or group of statistics produced a mean effect size of 
-0.250. 






Table 19
DEPARTMENTALIZED ELEMENTARY SCHOOL

Mean effect size = -0.090
s.d. = 0.165
N = 3

                                      Mean     Standard            N
                               Effect Size    Deviation

By Year of Publication
  1963                               0.080        0.000
  1967                              -0.100        0.000
  1969                              -0.250        0.000

By Form of Reporting
  Journal article                    0.080        0.000
  Dissertation                      -0.175        0.106

By Grade Level of Subjects
  5                                 -0.175        0.106
  6                                  0.080        0.000

By Validity of Design
  Nonrandom                         -0.090        0.165            3

By Subject Matter
  General Science                   -0.090        0.165            3

By Immediate or Retention
  Immediate                         -0.090        0.165            3

By Type of Outcome Criterion
  Cognitive                         -0.090        0.165            3

By Method of Measurement (Instrument)
  Published                         -0.090        0.165            3

By Calculation of Effect Size
  From raw data                      0.080        0.000
  By direct calculation             -0.250        0.000
  Less trustworthy                  -0.100        0.000

By Source of Means
  Covariance adjusted               -0.010        0.127
  Pre-post differences              -0.250        0.000




Individualized Instruction . Studies purporting to evaluate the use 
of systems based on the techniques of individualized instruction yielded 
131 effect sizes. The mean effect size thus obtained was 0.174, with a 
standard deviation of 0.459. Studies addressing this question were 
reported between the years 1961 and 1980 with the majority of effect sizes 
being produced between 1969 and 1974. The maximum effect size of 0.806 
was obtained in 1961 and the minimum effect size of -0.200 was obtained in
1967. The mean effect size for individualized instruction systems broken
down by form of reporting illustrates a trend which is apparent in other
facets of this meta-analysis, with the mean effect size for studies
reported in journal articles (0.405) being considerably larger than the mean
effect size for studies reported as dissertations (0.102).

The individualized instruction system was evaluated in grades 3 through 
12, with the bulk of the evaluations being performed in grades 7 through 
11; the mean effect sizes thus produced ranged from -0.100 to 0.467. Stud- 
ies which made use of the randomized allocation of subjects to experimental 
groups produced a mean effect size of 0.215 with a standard deviation of 
0.494 (based on 56 effect sizes), whereas studies which made use of 
matched allocation or the random allocation of intact groups produced a 
mean effect size of 0.175 with a standard deviation of 0.442 (based on 
53 effect sizes). Studies which made use of the nonrandomized allocation 
of subjects to experimental groups produced a mean effect size of 0.070 
with a standard deviation of 0.409 (based on 22 effect sizes). A trend of 
decreasing mean effect sizes is evident here as we move from true experi- 
mental designs through quasi-experimental designs of decreasing trust- 
worthiness. 
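
The three design categories above partition all 131 effect sizes, so their counts, means, and standard deviations should recombine into the overall figures (mean 0.174, standard deviation 0.459). The sketch below performs that recombination by adding within-group and between-group sums of squares. It assumes the reported standard deviations use an n - 1 denominator, which the report does not state; the function name `combine` is hypothetical.

```python
import math

# (n, mean, sd) for each design category, as reported in the text:
# randomized, matched or intact-random, and nonrandomized allocation.
groups = [(56, 0.215, 0.494), (53, 0.175, 0.442), (22, 0.070, 0.409)]


def combine(groups):
    """Combine subgroup (n, mean, sd) triples into overall (n, mean, sd)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Spread of effect sizes around their own subgroup means.
    within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    # Spread of subgroup means around the grand mean.
    between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    grand_sd = math.sqrt((within + between) / (total_n - 1))
    return total_n, grand_mean, grand_sd


total_n, grand_mean, grand_sd = combine(groups)
print(total_n, f"{grand_mean:.3f}", f"{grand_sd:.3f}")
```

Almost all of the total variation sits within the categories rather than between them, which is worth keeping in mind when reading the downward trend across designs.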




The individualized instruction system was evaluated across many
curriculum areas, with the maximum mean effect size of 0.430 being obtained in
the life science curriculum and the minimum effect size of 0.000 being
obtained in the earth science curriculum, although in both of these areas
only a single effect size was generated. In those areas in which the bulk
of the effect sizes were produced (viz., general science with 36 effect
sizes, biology with 30 effect sizes, and chemistry with 43 effect sizes),
the maximum mean effect size of 0.265 with a standard deviation of 0.550
was obtained in biology and the minimum mean effect size of 0.016 with
a standard deviation of 0.252 was obtained in general science.

Of the 120 effect sizes which were coded as appropriate to immediate
or retention effects in this area (with 11 effect sizes having a missing
value on this variable), 108 addressed the question of immediate effects, and
12 addressed the question of retention effects. For those effect sizes
which dealt with the immediate evaluation of experimental effects, the mean
effect size was 0.220 with a standard deviation of 0.482, and for those
effect sizes which dealt with the delayed evaluation of experimental
effects, the mean effect size was -0.109 with a standard deviation of 0.234,
illustrating that any difference in effect between experimental and control
interventions decreased as time passed after the conclusion of the
interventions. In fact, although individualized instruction is seen to be more
effective than traditional instruction immediately on conclusion of the
experimental treatment, once the treatment is withdrawn, and time has passed,
traditional instruction is the system which retains its influence more
effectively than individualized instruction.

The mean effect size generated by the individualized instruction
system broken down by type of outcome criterion reveals a bewildering
variability; however, comparisons of mean effect sizes due to cognitive and
affective outcome criteria indicate (given the extant standard deviations)
little difference between the two. The mean effect size for those outcomes
based on cognitive criteria is 0.118 with a standard deviation of 0.440
(based on 102 effect sizes), whereas the mean effect size for those outcomes
based on affective criteria is 0.160 with a standard deviation of 0.373
(based on 10 effect sizes).

Those effect sizes which resulted from the use of published test
materials revealed a mean which was of the same magnitude as the mean
effect size obtained by use of modified published test materials and ad
hoc test materials taken together (with the standard deviations in the two
categories also being similar). The mean effect size attributed to the use
of published test materials was 0.159 with a standard deviation of 0.442
(based on 65 effect sizes) and the mean effect size attributed to the use
of modified published and ad hoc test materials was 0.159 with a standard
deviation of 0.453 (based on 64 effect sizes). The mean effect size
generated from those studies which reported raw data (0.176 with a standard
deviation of 0.476 based on 72 effect sizes) was of a smaller magnitude
than the effect size generated from those studies which required effect
size calculation by transformation of common statistics. The mean effect
size generated by such transformations was 0.236 with a standard deviation
of 0.469 (based on 45 effect sizes).



Table 20
INDIVIDUALIZED INSTRUCTION

Mean effect size = 0.174
s.d. = 0.459
N = 131

                                      Mean     Standard            N
                               Effect Size    Deviation

By Year of Publication
  1961                               0.806        0.438            5
  1963                               0.190        0.651            2
  1964                               0.387        0.224            3
  1965                               0.047        0.133            3
  1966                               0.011        0.233            8
  1967                              -0.200        0.400            3
  1968                               0.076        0.397            9
  1969                               0.067        0.244           15
  1970                               0.047        0.392           19
  1971                               0.334        0.530           27
  1972                               0.316        0.639           10
  1973                         [illegible]  [illegible]  [illegible]
  1974                              -0.059        0.500  [illegible]
  1975                               0.507        0.376            3
  1976                               0.347        0.186            3
  1977                               0.210        0.184            2
  1980                               0.000        0.000            2

By Form of Reporting
  Journal article                    0.405        0.519           29
  Dissertation                       0.102        0.422          100
  Conference paper                   0.450        0.113            2

By Grade Level of Subjects
  3                                  0.000        0.000            1
  4                                 -0.007        0.042            3
  5                                  0.116        0.302            7
  6                                 -0.100        0.461            8
  7                                  0.027        0.226           17
  8                                  0.404        0.585           15
  9                                  0.192        0.328           14
  10                                 0.112        0.440           20
  11                                 0.215        0.493           40
  12                                 0.467        0.723            6

By Validity of Design
  Random                             0.215        0.494           56
  Matched & Intact Random            0.175        0.442           53
  Nonrandom                          0.070        0.409           22

By Subject Matter
  General Science                    0.016        0.252           36
  Life Science                       0.430        0.000            1
  Physical Science                   0.216        0.271           10
  Biology                            0.265        0.550           30
  Earth Science                      0.000        0.000            1
  Chemistry                          0.204        0.508           43
  Physics                            0.322        0.652            9
  Other                              0.030        0.000            1

By Immediate or Retention
  Immediate                          0.220        0.482          108
  Retention                         -0.109        0.234           12

By Type of Outcome Criterion
  Cognitive                          0.118        0.440          102
  Affective                          0.160        0.373           10
  Science Methods                    0.428        0.565            9
  Psychomotor                        1.165        0.064            2
  Critical Thinking                  0.325        0.405            4
  Creativity                         0.495        0.530            2
  Self-Concept                       0.365        0.078            2

By Method of Measurement (Instrument)
  Published                          0.159        0.442           65
  Modified published & ad hoc        0.159        0.453           64
  Other                              1.165        0.064            2

By Calculation of Effect Size
  From raw data                      0.176        0.476           72
  By direct calculation              0.236        0.469           45
  Less trustworthy                  -0.032        0.258           14

By Source of Means
  Unadjusted posttest                0.176        0.514           51
  Covariance adjusted                0.198        0.467           32
  Pre-post differences               0.150        0.412           40
  Other                              0.190        0.330            8






Mastery Learning. Thirteen effect sizes were obtained in the area of
mastery learning; the mean effect size was 0.644 with a standard deviation
of 0.430. Compared to other mean effect sizes reported in this
meta-analysis, a mean effect size of 0.644 can be considered significant;
however, as studies which purport to investigate the effects of mastery
learning tended to remediate the experimental group on the basis of errors made
by the participants on the outcome measure while the control group was not
thus remediated, a large effect size is to be expected.

In this meta-analysis, studies investigating the mastery learning 
system were obtained for the years 1971 (7 effect sizes), 1975 (1 effect 
size), and 1977 (5 effect sizes). The maximum mean effect size of 0.857 
with a standard deviation of 0.467 was obtained in 1971 and the minimum
mean effect size of 0.368 with a standard deviation of 0.219 was obtained 
in 1977. Effect sizes originating from reports published in journals
produced a mean effect size (0.713 with a standard deviation of 0.500) which
was about one and a half times the mean effect size (0.488 with a standard
deviation of 0.161) reported at conferences.

The subjects which were used as participants in studies purporting to 
evaluate the effectiveness of mastery learning systems in this meta-analysis 
ranged across grades 8, 11, and 12; the maximum effect size of 0.857 
was obtained in grade 8 and the minimum effect size of 0.368 was obtained 
in grade 11. The majority of the effect sizes included here originate 
from studies which utilized the random allocation of subjects to experi- 
mental groups, and such effect sizes have a mean of 0.742 with a standard 
deviation of 0.434 (based on 10 effect sizes). Studies which utilized the 
nonrandom allocation of subjects to experimental groups produced a mean
effect size of 0.210 with a standard deviation of 0.184 (based on 2 effect
sizes).

The curriculum areas addressed in this subsection are biology, chemistry, 
and physics. The maximum mean effect size of 0.857 was obtained in biology 
and the minimum mean effect size of 0.368 was obtained in chemistry. All 
effect sizes subsumed here are attributable to the immediate evaluation of 
experimental effects on conclusion of the experimental intervention. 

The mean effect size due to cognitive outcome criteria was 0.498 with 
a standard deviation of 0.278 (based on 8 effect sizes) and the mean effect 
size due to affective outcome criteria was 0.515 with a standard deviation 
of 0.446 (based on 2 effect sizes). Those studies which made use of
published test materials in their evaluation of the experimental and control
groups produced a mean effect size of 0.713 with a standard deviation of
0.500 (based on 9 effect sizes) while those studies which utilized either
modified published test materials or investigator-authored test materials
produced a mean effect size of 0.488 with a standard deviation of 0.161
(based on 4 effect sizes). In those cases in which it was possible to 
calculate a mean effect size from raw data reported in the study itself, 
a mean effect size of 0.473 with a standard deviation of 0.194 (based on 
3 effect sizes) was produced; studies which reported their outcomes as a 
statistic or group of statistics generated a mean effect size by transfor- 
mation of those statistics of 0.857 with a standard deviation of 0.467 
(based on 7 effect sizes). 






Table 21
MASTERY LEARNING

Mean effect size = 0.644
s.d. = 0.430
N = 13

                                      Mean     Standard            N
                               Effect Size    Deviation

By Year of Publication
  1971                               0.857        0.467            7
  1975                               0.530        0.000            1
  1977                               0.368        0.219            5

By Form of Reporting
  Journal article                    0.713        0.500            9
  Conference paper                   0.488        0.161            4

By Grade Level of Subjects
  8                                  0.857        0.467            7
  11                                 0.368  [illegible]  [illegible]
  12                                 0.530        0.000            1

By Validity of Design
  Random                             0.742        0.434           10
  Matched & Intact Random            0.530        0.000            1
  Nonrandom                          0.210        0.184            2

By Subject Matter
  Biology                            0.857        0.467            7
  Chemistry                          0.368        0.219            5
  Physics                            0.530        0.000            1

By Immediate or Retention
  Immediate                          0.644        0.430           13

By Type of Outcome Criterion
  Cognitive                          0.498        0.278            8
  Affective                          0.515        0.446            2
  Science Methods                    1.235        0.714            2
  Critical Thinking                  0.890        0.000            1

By Method of Measurement (Instrument)
  Published                          0.713        0.500            9
  Modified published & ad hoc        0.488        0.161            4

By Calculation of Effect Size
  From raw data                      0.473        0.194            3
  By direct calculation              0.857        0.467            7
  Less trustworthy                   0.317        0.226            3

By Source of Means
  Unadjusted posttest                0.368        0.219            5
  Covariance adjusted                0.857        0.467            7
  Other                              0.530        0.000            1






Media-Based Systems. Instructional systems based principally on the
use of media (including film, television, and the like) as a means of
inaugurating their effects gave rise to 100 effect sizes. The mean effect
size in media-based systems was -0.023 with a standard deviation of 0.369.
Within the media-based system, 40 effect sizes are attributable to
television as a medium, and 58 are attributable to film as a medium; each of
these will be reported in greater detail later. The mean effect size for
television based systems was 0.055 with a standard deviation of 0.347, and
the mean effect size for film based systems was -0.065 with a standard
deviation of 0.379.

Studies of media-based instructional systems that were reported in the 
literature in the years from 1950 to 1973 are included here. The maximum 
mean effect size of 1.050 was derived from a study reported in 1952 and 
the minimum mean effect size of -0.558 was derived from studies reported 
in 1962. Although considerable variability exists in the mean effect
sizes broken down by year of reporting, no overall trend is apparent. The 
mean effect size derived from studies reported in journals was -0.005 with 
a standard deviation of 0.393 (based on 37 effect sizes) whereas the 
mean effect size derived from studies reported as dissertations was -0.012 
with a standard deviation of 0.370 (based on 47 effect sizes). Studies 
reported as unpublished articles produced a mean effect size of -0.097 with
a standard deviation of 0.32 (based on 16 effect sizes). Although a slight 
downwards trend is evident here, the magnitude of the standard deviations 
involved prevents any such claim from being substantiated. 

Subjects whose performances were evaluated in investigations of media- 
based systems were drawn from grades 1 through 12 with the bulk of the 
studies being performed at grades 5, 9, 11, and 12. The maximum mean 
effect size of 0.393 with a standard deviation of 0.012 (based on 3 effect 
sizes) was obtained at grade 1 and the minimum mean effect size of -0.262 






with a standard deviation of 0.284 (based on 25 effect sizes) was obtained 
at grade 12; no clear relationship between mean effect size and grade is 
evident in the data. 

Of the 100 effect sizes reported, 15 are attributable to studies which
utilized a randomized allocation of subjects to groups, 41 are attributable
to studies which utilized matched allocation of subjects or random
allocation of intact groups, and 44 are attributable to studies which utilized
nonrandom assignment. Studies which made use of the random allocation of
subjects to groups produced a mean effect size of -0.219 with a standard
deviation of 0.443, studies which made use of matched or random allocation
of intact groups produced a mean effect size of 0.071 with a standard
deviation of 0.266, and studies which used a nonrandomized allocation produced
a mean effect size of -0.044 with a standard deviation of 0.402.

Effect sizes describing the media-based system were derived from
studies in several curriculum areas (general science, physical science, biology,
chemistry, physics, and other). The maximum mean effect size of 0.149 with
a standard deviation of 0.477 (based on 15 effect sizes) was obtained in the
curriculum area of biology and the minimum effect size of -0.277 with a
standard deviation of 0.288 (based on 27 effect sizes) was obtained in the
area of physics.

Of the 97 effect sizes for which information concerning the timing of 
outcome measurement was available, 85 addressed the question of immediate 
effects and 12 addressed the question of delayed effects. For those effect 
sizes which were based on immediate evaluation of experimental effects, the 
mean effect size was -0.009 with a standard deviation of 0.377, and for 
those effect sizes which were based on the delayed evaluation of
experimental effects, the mean effect size was -0.133 with a standard deviation
of 0.347, illustrating an erosion of effect due to the media-based system
relative to the effects created by traditional instruction over time.

In the case of effects evaluated as cognitive outcomes, the mean effect
size was -0.030 with a standard deviation of 0.388 (based on 75 effect
sizes) and for the effects evaluated as affective outcomes the mean effect
size was -0.104 with a standard deviation of 0.298 (based on 16 effect
sizes). Effect sizes derived from studies making use of published test
materials revealed a lower mean effect size (-0.081) than effect sizes
derived from studies making use of modified published test materials or
investigator-authored test materials, these latter generating a mean effect
size of 0.038. However, the magnitude of the standard deviations involved
prevents this difference from being substantiated. Effect sizes making use
of published test materials make up 51% of the total and effect sizes making
use of modified published or ad hoc test materials make up 49% of the total.
Those effect sizes which could be calculated directly from raw data
reported in the studies gave a mean effect size of -0.080 with a standard
deviation of 0.345 (based on 42 effect sizes) and those effect sizes which
could be calculated by transformation of reported statistics gave a mean
effect size of 0.055 with a standard deviation of 0.413 (based on 44 effect
sizes).
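
The caution about the published versus modified or ad hoc comparison can be made concrete by expressing the difference between the two means in units of its standard error. This is a rough sketch, not an analysis from the report: it treats the effect sizes as independent observations, which is a strong assumption because several effect sizes often come from one study. The standard deviations (0.351 and 0.381) and counts (51 and 49) are the values listed for these categories in Table 22; the function name is hypothetical.

```python
import math


def diff_in_se_units(m1, sd1, n1, m2, sd2, n2):
    """Return (difference, standard error, ratio) for two subgroup means.

    Uses the unpooled standard error sqrt(sd1^2/n1 + sd2^2/n2) and
    assumes independent effect sizes (an assumption, see lead-in).
    """
    diff = m2 - m1
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, se, diff / se


# Published tests vs. modified published or ad hoc tests.
diff, se, ratio = diff_in_se_units(-0.081, 0.351, 51, 0.038, 0.381, 49)
print(f"difference = {diff:.3f}, standard error = {se:.3f}, ratio = {ratio:.2f}")
```

A ratio below about 2 is the conventional rough threshold for a difference that cannot be distinguished from sampling noise, which is consistent with the text's reluctance to interpret the gap.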



Table 22
MEDIA-BASED SYSTEMS

Mean effect size = -0.023
s.d. = 0.369
N = 100

                                      Mean     Standard            N
                               Effect Size    Deviation

By Type of Media
  Television                         0.055        0.347           40
  Film                              -0.065        0.379           58
  Slides                            -0.470        0.000            1
  Tapes                             -0.270        0.000            1

By Form of Reporting
  Journal article                   -0.005        0.393           37
  Dissertation                      -0.012        0.370           47
  Unpublished article               -0.097        0.320           16

By Year of Publication
  1950                         [illegible]        0.014  [illegible]
  1951                         [illegible]  [illegible]  [illegible]
  1952                               1.050        0.000            1
  1956                         [illegible]        0.050  [illegible]
  1957                               0.025        0.050  [illegible]
  1959                              -0.194        0.334            8
  1960                               0.069        0.161            7
  1961                              -0.198        0.231           19
  1962                              -0.558        0.195            4
  1963                              -0.183        0.320            4
  1964                               0.117        0.221            6
  [illegible]                  [illegible]  [illegible]  [illegible]
  1968                               0.208        0.310            6
  1969                               0.225        0.149            6
  1970                              -0.387        0.618            3
  1971                               0.028        0.366           19
  1972                              -0.124        0.249            5
  1973                              -0.130        0.481            2

By Grade Level of Subjects
  1                                  0.393        0.012            3
  2                                 -0.253  [illegible]  [illegible]
  3                            [illegible]  [illegible]  [illegible]
  4                                 -0.007  [illegible]  [illegible]
  5                                  0.130  [illegible]           11
  6                                 -0.190  [illegible]            7
  7                                  0.180        0.000  [illegible]
  8                                  0.045  [illegible]  [illegible]
  9                                  0.116        0.160  [illegible]
  10                                 0.390        0.445            6
  11                                 0.007        0.327           17
  12                                -0.262        0.284           25

By Validity of Design
  Random                            -0.219        0.443           15
  Matched & Intact Random            0.071        0.266           41
  Nonrandom                         -0.044        0.402           44

By Subject Matter
  General Science                    0.066        0.328           36
  Physical Science                   0.096        0.159            5
  Biology                            0.149        0.477           15
  Chemistry                         -0.009        0.324           15
  Physics                           -0.277        0.288           27
  Other                              0.125        0.460            2

By Immediate or Retention
  Immediate                         -0.009        0.377           85
  Retention                         -0.133        0.347           12

By Type of Outcome Criterion
  Cognitive                         -0.030        0.388           75
  Affective                         -0.104        0.298           16
  Science Methods                    0.118        0.143            5
  Psychomotor                       -0.080        0.000            1
  Critical Thinking                  0.160        0.014            2
  Creativity                         0.770        0.000            1

By Method of Measurement (Instrument)
  Published                         -0.081        0.351           51
  Modified published & ad hoc        0.038        0.381           49

By Calculation of Effect Size
  From raw data                     -0.080        0.345           42
  By direct calculation              0.055        0.413           44
  Less trustworthy                  -0.097        0.244           14

By Source of Means
  Unadjusted posttest               -0.042        0.393           40
  Covariance adjusted               -0.048        0.279           26
  Pre-post differences               0.071        0.416           28
  Other                             -0.225        0.250            6






Television. Of the 100 effect sizes summarized previously under the
heading of "Media-Based Instruction," 40 made use of television as the
medium of instruction. It is these 40 effect sizes that will be dealt with
here. Television-based instruction systems produced a mean effect size of
0.055 with a standard deviation of 0.347 and were reported in the years
between 1957 and 1971. The mean effect size for studies reporting their
outcomes in journals was 0.110 with a standard deviation of 0.194 (based
on 10 effect sizes) and the mean effect size derived from studies reported
as dissertations was 0.026 with a standard deviation of 0.411 (based on 26
effect sizes), illustrating again the trend towards higher effects being
reported in journals than in dissertations. Studies evaluating this
system were performed at grades 1 through 9, and grade 12, with no substantial
trend being apparent across the grades. Studies which made use of the
randomized allocation of subjects to groups produced a mean effect size of
0.285 with a standard deviation of 0.686 (based on 2 effect sizes), the mean
effect size derived from the matched allocation of subjects to groups or
the random allocation of intact groups to treatment groups was 0.086 with
a standard deviation of 0.287 (based on 34 effect sizes), and the mean
effect size generated in studies which utilized the nonrandom allocation
of subjects to experimental groups was -0.320 with a standard deviation
of 0.522 (based on 4 effect sizes).

The curriculum areas of general science, physical science, biology,
and physics formed the bodies of scientific expertise which were utilized
in the evaluation of this system. Both general science and physical science
generated positive but small mean effect sizes (0.092 and 0.096
respectively), whereas biology and physics generated negative mean effect sizes
(-0.049 and -0.160 respectively). All effect sizes in this subsection
derived from the immediate evaluation of experimental effects on the
conclusion of the experiment.

The mean effect size for cognitive outcomes was 0.022 with a standard 
deviation of 0.355 (based on 33 effect sizes) and these effect sizes con- 
stituted the bulk of the effect sizes apparent in this subsection of the 
meta-analysis. In those studies which made use of published test materials 
7 effect sizes were generated with a mean effect size of 0.020 and a
standard deviation of 0.119. All other studies in this area made use either of
modified published test materials or investigator-authored test materials, 
and registered a mean effect size of 0.063 with a standard deviation of 
0.379 (based on 33 effect sizes). Those effect sizes which were able to 
be calculated from raw data reported in the studies themselves produced a 
mean effect size of 0.018 with a standard deviation of 0.428 (12 effect 
sizes); of the remaining 28 effect sizes, 23 were produced by transforma- 
tion of reported statistics and this group gave rise to a mean effect size 
of 0.066 with a standard deviation of 0.341. 



Table 23
TELEVISION INSTRUCTION

Mean effect size = 0.055
s.d. = 0.347
N = 40

                                      Mean     Standard            N
                               Effect Size    Deviation

By Year of Publication
  1957                               0.060        0.000            1
  1960                               0.110        0.157            4
  1964                               0.090        0.115            3
  1968                               0.208        0.310            6
  1969                               0.205        0.188            4
  1970                              -0.387        0.618            3
  1971                               0.028        0.366           19

By Form of Reporting
  Journal article                    0.110        0.194           10
  Dissertation                       0.026        0.411           26
  Unpublished article                0.110        0.157            4

By Grade Level of Subjects
  1                                  0.393        0.012            3
  2                                 -0.253        0.280            3
  3                                  0.058        0.524            6
  4                                 -0.007        0.187            6
  5                                  0.197        0.135            7
  6                                 -0.118        0.666            5
  7                                  0.180        0.000            1
  8                                  0.045        0.120            2
  9                                  0.090        0.143            6
  12                                -0.120        0.000            1

By Validity of Design
  Random                             0.285        0.686            2
  Matched & Intact Random            0.086        0.287           34
  Nonrandom                         -0.320        0.522            4

By Subject Matter
  General Science                    0.092        0.342           26
  Physical Science                   0.096        0.159            5
  Biology                           -0.049        0.495            7
  Physics                           -0.160        0.057            2

By Immediate or Retention
  Immediate                          0.055        0.347           40

By Type of Outcome Criterion
  Cognitive                          0.022        0.355           33
  Affective                         -0.120        0.000            1
  Science Methods                    0.173        0.087            4
  Critical Thinking                  0.150        0.000            1
  Creativity                         0.770        0.000            1

By Method of Measurement (Instrument)
  Published                          0.020        0.119            7
  Modified published & ad hoc        0.063        0.379           33

By Calculation of Effect Size
  From raw data                      0.018        0.428           12
  By direct calculation              0.066        0.341           23
  Less trustworthy                   0.096        0.159            5

By Source of Means
  Unadjusted posttest                0.018        0.428           12
  Covariance adjusted                0.144        0.171            9
  Pre-post differences               0.041        0.372           18
  Other                             -0.040        0.000            1




Film Based Instruction. Studies evaluated under the film based
instruction subsection of the meta-analysis generated a total of 58 effect sizes
with a mean effect size of -0.065 and a standard deviation of 0.378. The
effect sizes were derived from studies reported between the years of 1950
and 1973 with the maximum mean effect size occurring in 1952 (1.050) and
the minimum mean effect size occurring in 1962 (-0.558); no obvious trend
is apparent in the data. The mean effect size for studies reported in
journals was -0.047 with a standard deviation of 0.440 (based on 27 effect
sizes) whereas the mean effect size for studies reported in dissertations
was -0.026 with a standard deviation of 0.311 (based on 19 effect sizes),
reversing the trend apparent in other sections of the meta-analysis.

The subjects who formed the basis of the experimental and control groups 
in the evaluation of the film based instructional system were drawn from 
grades 5, 7, and 9 through 12, with the bulk of the effect sizes being 
obtained in grades 11 and 12. The minimum mean effect size of -0.268 with
a standard deviation of 0.288 (based on 24 effect sizes) was obtained in 
grade 12 and the maximum mean effect size of 0.390 with a standard devia- 
tion of 0.445 (based on 6 effect sizes) was obtained in grade 10; no 
obvious trend is apparent in the data. 

Studies whose groups were generated by random allocation of subjects 
produced a mean effect size of -0.283 with a standard deviation of 0.407 
(based on 11 effect sizes) whereas those studies which utilized matched 
allocation of subjects or the random allocation of intact groups produced 
a mean effect size of 0.000 with a standard deviation of 0.107 (based on 7 
effect sizes). The remaining 40 effect sizes were produced in studies which 
utilized a nonrandom allocation procedure and these effect sizes have a mean
of -0.016 with a standard deviation of 0.385. 
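
One rough way to read the three allocation subgroups above is to attach a standard error to each mean. The sketch below assumes that the effect sizes within a subgroup are independent, which the report does not state (several effect sizes may come from a single study), so the resulting figures are illustrative only. The helper name `standard_error` is introduced here for illustration and is not part of the report.

```python
import math


def standard_error(sd, n):
    """Standard error of a mean, assuming n independent effect sizes (an assumption)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


# (mean effect size, standard deviation, number of effect sizes) as reported in the text
film_by_allocation = {
    "random": (-0.283, 0.407, 11),
    "matched or intact random": (0.000, 0.107, 7),
    "nonrandom": (-0.016, 0.385, 40),
}

for label, (mean, sd, n) in film_by_allocation.items():
    se = standard_error(sd, n)
    print(f"{label}: mean {mean:+.3f}, approximate standard error {se:.3f}")
```

Under that independence assumption the random-allocation mean of -0.283 carries a standard error of roughly 0.12, which gives a sense of how much weight a subgroup of 11 effect sizes can bear.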


Curriculum areas addressed under the heading of film-based instruction
were general science, biology, chemistry, physics, and other, with the
minimum mean effect size of -0.287 with a standard deviation of 0.298
(based on 25 effect sizes) occurring in physics, and the maximum mean
effect size of 0.323 with a standard deviation of 0.414 (based on 8 effect
sizes) occurring in biology. Of the 58 effect sizes appertaining to film-
based instructional systems, 55 possessed codings as to the immediate or
delayed nature of their effects; the remaining 3 effect sizes were uncoded
on this variable. Forty-three effect sizes were derived from the immediate
evaluation of educational outcomes and these immediate effect sizes gave
rise to a mean effect size of -0.051 with a standard deviation of 0.399,
whereas the mean effect size based on delayed measurement of educational
outcomes was -0.133 with a standard deviation of 0.347 (based on 12
effect sizes).

The mean effect size for film-based instructional systems based on
cognitive outcome criteria was -0.055 with a standard deviation of 0.416
(based on 40 effect sizes) and the mean effect size based on affective
outcome criteria was -0.103 with a standard deviation of 0.309 (based on
15 effect sizes). Those studies which made use of published test materials
in their evaluation of treatment effects generated a mean effect size
of -0.084 with a standard deviation of 0.377 (based on 42 effect sizes),
while all other effect sizes in this subsection made use of either modified
published test materials or investigator-authored test materials and
generated a mean effect size of -0.014 with a standard deviation of 0.390. Of
the 58 effect sizes, 28 were obtained from raw data contained in the studies
themselves and the mean effect size thus obtained was -0.101 with a standard
deviation of 0.308. The mean effect size obtained from the 21 effect
sizes derived from studies which reported their outcomes as a statistic
or group of statistics was 0.044 with a standard deviation of 0.488.
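
The distinction drawn above between effect sizes computed from raw data and those derived from a reported statistic can be illustrated with a short sketch. It assumes the conventional standardized mean difference (treatment mean minus control mean, divided by the control group standard deviation) and the textbook conversion from an independent-samples t statistic; the exact formulas used by the project are not reproduced in this section, so both functions are assumptions, and the function names and input numbers are hypothetical.

```python
import math


def effect_size_from_raw(mean_exp, mean_ctrl, sd_ctrl):
    """Standardized mean difference with the control-group SD as the yardstick (assumed)."""
    if sd_ctrl <= 0:
        raise ValueError("sd_ctrl must be positive")
    return (mean_exp - mean_ctrl) / sd_ctrl


def effect_size_from_t(t, n_exp, n_ctrl):
    """Approximate standardized mean difference from an independent-samples t.

    This recovers a difference scaled by the pooled SD, which only
    approximates the control-SD version above.
    """
    return t * math.sqrt(1.0 / n_exp + 1.0 / n_ctrl)


# Hypothetical inputs, not taken from any study in the report.
print(round(effect_size_from_raw(52.0, 50.0, 8.0), 3))  # 0.25
print(round(effect_size_from_t(2.0, 25, 25), 3))        # 0.566
```

The two routes need not agree exactly, which is one reason a table might report them as separate categories.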






Table 24
FILM BASED INSTRUCTION

Δ = -0.065
s.d. = 0.378
N = 58

                                         Standard
                                     Δ   Deviation      N

By Year of Publication
  1950                           0.250      0.014      2
  1951                           0.870      0.495      2
  1952                           1.050      0.000      1
  1956                           0.035      0.050      2
  1957                          -0.010      0.000      1
  1959                          -0.194      0.334      8
  1960                           0.013      0.180      3
  1961                          -0.198      0.231     19
  1962                          -0.558      0.195      4
  1963                          -0.183      0.320      4
  1964                           0.143      0.327      3
  1966                           0.155      0.106      2
  1969                           0.265      0.021      2
  1972                           0.040      0.114      3
  1973                          -0.130      0.481      2

By Form of Reporting
  Journal article               -0.047      0.440     27
  Dissertation                  -0.026      0.311     19
  Unpublished article           -0.166      0.335     12

By Grade Level of Subjects
  5                              0.013      0.329      4
  7                              0.180      0.000      1
  9                              0.146      0.190      5
  10                             0.390      0.445      6
  11                             0.007      0.327     17
  12                            -0.268      0.288     24

By Validity of Design
  Random                        -0.283      0.407     11
  Matched & Intact Random        0.000      0.107      7
  Nonrandom                     -0.016      0.385     40

By Subject Matter
  General Science                0.090      0.244      8
  Biology                        0.323      0.414      8
  Chemistry                     -0.009      0.324     15
  Physics                       -0.287      0.298     25
  Other                          0.125      0.460      2

By Immediate or Retention
  Immediate                     -0.051      0.399     43
  Retention                     -0.133      0.347     12

By Type of Outcome Criterion
  Cognitive                     -0.055      0.416     40
  Affective                     -0.103      0.309     15
  Science Methods               -0.100      0.000      1
  Psychomotor                   -0.080      0.000      1
  Critical Thinking              0.170      0.000      1

By Method of Measurement (Instrument)
  Published                     -0.084      0.377     42
  Modified published & ad hoc   -0.014      0.390     16

By Calculation of Effect Sizes
  From raw data                 -0.101      0.308     28
  By direct calculation          0.044      0.488     21
  Less trustworthy              -0.204      0.219      9

By Source of Means
  Unadjusted posttest           -0.044      0.386
  Covariance adjusted           -0.149      0.274
  Pre-post differences           0.125      0.503
  Other                         -0.262      0.260






Personalized System of Instruction. The studies assessed in this meta-
analysis which purported to evaluate the efficacy of the personalized system
of instruction had a mean effect size of 0.603 with a standard deviation of
0.423 and there were 15 effect sizes in all. The studies appropriate to
this system were reported in the years 1971, 1974, and 1977, with mean
effect sizes of 0.857, 0.403, and 0.368 respectively.

Effect sizes generated from studies reported in journals gave rise to
a mean effect size of 0.713 with a standard deviation of 0.500 (based on 9
effect sizes) and those derived from studies reported as unpublished
articles gave rise to a mean effect size of 0.403 with a standard deviation of
0.280 (based on 3 effect sizes). The mean effect size derived from studies
reported at conferences was 0.473 with a standard deviation of 0.194 (based
on 3 effect sizes). This supports the trend evidenced earlier that mean
effect sizes reported in journal articles tend, on the whole, to be larger
than those reported elsewhere. Studies pertaining to the personalized
system of instruction were carried out in grades 5, 8, and 11, and the maximum
mean effect size of 0.857 was obtained in grade 8.

Studies which utilized the random allocation of subjects to experimental
and control groups generated a mean effect size of 0.742 with a standard
deviation of 0.434 (based on 10 effect sizes), studies which utilized the
matched allocation or the random allocation of intact groups generated a
mean effect size of 0.403 with a standard deviation of 0.280 (based on 3
effect sizes), and studies which made use of the nonrandom allocation of
subjects to groups generated a mean effect size of 0.210 with a standard
deviation of 0.184 (based on 2 effect sizes). The subject matter areas of
general science, biology, and chemistry were used as curriculum areas in
the evaluation of the personalized system of instruction, and the maximum
effect size of 0.857 was obtained in biology. All 15 of the effect sizes
in this subsection were generated by the evaluation of educational outcomes
immediately on completion of the interventions.
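
Because every effect size falls into exactly one design category, the subgroup means reported above should reproduce the overall mean when weighted by their counts. The following is a minimal sketch of that arithmetic check; the function name `weighted_mean` is an illustrative choice, and the figures are those quoted in the text for the personalized system of instruction.

```python
def weighted_mean(groups):
    """Mean of subgroup means, weighted by the number of effect sizes in each subgroup."""
    total_n = sum(n for _, n in groups)
    if total_n == 0:
        raise ValueError("no effect sizes supplied")
    return sum(mean * n for mean, n in groups) / total_n


# (mean effect size, number of effect sizes) by validity of design
psi_by_design = [
    (0.742, 10),  # random allocation of subjects
    (0.403, 3),   # matched allocation or random allocation of intact groups
    (0.210, 2),   # nonrandom allocation
]

print(round(weighted_mean(psi_by_design), 3))  # 0.603, the overall mean reported for the 15 effect sizes
```

The same check can be applied to any breakdown in these tables whose categories cover all of the effect sizes.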

The mean value of effect sizes generated by the use of cognitive
outcome criteria was 0.493 with a standard deviation of 0.300 (based on 7
effect sizes) and the mean value of effect sizes generated by the use of
affective outcome criteria was 0.515 with a standard deviation of 0.446
(based on 2 effect sizes).

Studies which made use of published test materials in their evaluation
of experimental and control group effects produced a mean effect size of 
0.713 with a standard deviation of 0.500 (based on 9 effect sizes) while 
studies which made use of modified published test materials or investigator- 
authored materials produced a mean effect size of 0.438 with a standard 
deviation of 0.219 (based on 6 effect sizes). In the case of the 6 effect 
sizes which were calculated directly from raw data reported in the studies, 
the mean value was 0.438 with a standard deviation of 0.219, whereas the 
mean value of the 7 effect sizes obtained by transformation of reported 
statistics was 0.857 with a standard deviation of 0.467. 



Table 25
PERSONALIZED SYSTEM OF INSTRUCTION (PSI)

Δ = 0.603
s.d. = 0.423
N = 15

                                         Standard
                                     Δ   Deviation      N

By Year of Publication
  1971                           0.857      0.467      7
  1974                           0.403      0.280      3
  1977                           0.368      0.219      5

By Form of Reporting
  Journal article                0.713      0.500      9
  Unpublished article            0.403      0.280      3
  Conference paper               0.473      0.194      3

By Grade Level of Subjects
  5                              0.403      0.280      3
  8                              0.857      0.467      7
  11                             0.368      0.219      5

By Validity of Design
  Random                         0.742      0.434     10
  Matched & Intact Random        0.403      0.280      3
  Nonrandom                      0.210      0.184      2

By Subject Matter
  General Science                0.403      0.280      3
  Biology                        0.857      0.467      7
  Chemistry                      0.368      0.219      5

By Immediate or Retention
  Immediate                      0.603      0.423     15

By Type of Outcome Criterion
  Cognitive                      0.493      0.300      7
  Affective                      0.515      0.446      2
  Science Methods                1.235      0.714      2
  Critical Thinking              0.890      0.000      1
  Logical Thinking               0.403      0.280      3

By Method of Measurement (Instrument)
  Published                      0.713      0.500      9
  Modified published & ad hoc    0.438      0.219      6

By Calculation of Effect Size
  From raw data                  0.438      0.219      6
  By direct calculation          0.857      0.467      7
  Less trustworthy               0.210      0.184      2

By Source of Means
  Unadjusted posttest            0.381      0.224      8
  Covariance adjusted            0.857      0.467      7






Programmed Instruction. The 52 effect sizes which were collected under
the umbrella of programmed instruction had a mean value of 0.174 with a
standard deviation of 0.475. Studies appropriate to this area were reported
in the years between 1961 and 1973 inclusive, and the mean effect sizes range
from -0.200 in 1967 to 0.806 in 1961; no obvious trend is apparent in the
data. The pattern recognized earlier concerning the mean effect sizes
derived from studies reported in journals as opposed to studies reported in
dissertations is repeated here. For effect sizes derived from journals,
the mean effect size was 0.301 with a standard deviation of 0.448 (based
on 7 effect sizes) while the mean effect size derived from studies reported
in dissertations was 0.154 with a standard deviation of 0.480 (based on
45 effect sizes). Effect sizes were obtained from grades 4 and 6 through
12 with the maximum mean effect size of 1.07 occurring in grade 12 and the
minimum mean effect size of -0.415 occurring in grade 8.

Studies which made use of random allocation of subjects to experimental
and control groups produced a mean effect size of 0.173 with a standard
deviation of 0.413 (based on 15 effect sizes), effect sizes derived from
studies which made use of matched allocation or the random allocation of
intact groups gave rise to a mean effect size of 0.186 with a standard
deviation of 0.467 (based on 31 effect sizes), and studies which made use
of nonrandom assignation gave rise to a mean effect size of 0.113 with a
standard deviation of 0.710 (based on 6 effect sizes). General science,
life science, physical science, biology, chemistry, and physics were the
curriculum areas addressed under this system. The minimum mean effect
size of -0.065 was obtained in general science and the maximum mean effect
size of 0.533 was obtained in physics.

Of the 52 effect sizes reported in this area, 40 addressed the question
of immediate experimental effects and these immediate effect sizes had a
mean value of 0.260 with a standard deviation of 0.497. Eight effect
sizes addressed the question of delayed experimental effects, and these
effect sizes had a mean value of -0.113 with a standard deviation of 0.276,
thus supporting the trend evidenced earlier in other subsections of the
meta-analysis.

Studies which made use of published test materials gave rise to a mean 
effect size of 0.258 with a standard deviation of 0.394 (based on 10 effect 
sizes) whereas studies utilizing modified published test materials or 
investigator-authored test materials had a mean effect size of 0.154 with 
a standard deviation of 0.494 (based on 42 effect sizes). In those cases 
in which cognitive outcome criteria were used, the mean effect size was 
0.173 with a standard deviation of 0.479 (based on 51 effect sizes). The 
mean effect size obtained from studies which reported their raw data was 
0.173 with a standard deviation of 0.485 (based on 43 effect sizes) while 
the mean effect size obtained from those studies which reported their 
outcomes as one or more common statistics was 0.373 with a standard
deviation of 0.420 (based on 6 effect sizes).







Table 26
PROGRAMMED INSTRUCTION

Δ = 0.174
s.d. = 0.475
N = 52

                                         Standard
                                     Δ   Deviation      N

By Year of Publication
  1961                           0.806      0.438      5
  1963                           0.190      0.651      2
  1964                           0.403      0.195      3
  1965                           0.047      0.133      3
  1966                           0.040      0.236      7
  1967                          -0.200      0.400      3
  1968                           0.088      0.494      6
  1969                           0.265      0.021      2
  1970                          -0.046      0.495      8
  1971                           0.013      0.310      7
  1972                           0.767      0.430      3
  1973                           0.173      0.780      3

By Form of Reporting
  Journal article                0.301      0.448      7
  Dissertation                   0.154      0.480     45

By Grade Level of Subjects
  4                             -0.030      0.014      2
  6                             -0.070      0.521      7
  7                              0.023      0.342      4
  8                             -0.415      0.205      2
  9                              0.216      0.207      5
  10                             0.253      0.276     11
  11                             0.270      0.570     20
  12                             1.070      0.000      1

By Validity of Design
  Random                         0.173      0.413     15
  Matched & Intact Random        0.186      0.467     31
  Nonrandom                      0.113      0.710      6

By Subject Matter
  General Science               -0.065      0.342     10
  Life Science                   0.430      0.000      1
  Physical Science               0.148      0.161      4
  Biology                        0.055      0.424     12
  Chemistry                      0.291      0.550     22
  Physics                        0.533      0.516      3

By Immediate or Retention
  Immediate                      0.260      0.497     40
  Retention                     -0.113      0.276      8

By Type of Outcome Criterion
  Cognitive                      0.173      0.479     51
  Affective                      0.200      0.000      1

By Method of Measurement (Instrument)
  Published                      0.258      0.394     10
  Modified published & ad hoc    0.154      0.494     42

By Calculation of Effect Size
  From raw data                  0.173      0.485     43
  By direct calculation          0.373      0.420      6
  Less trustworthy              -0.207      0.140      3

By Source of Means
  Unadjusted posttest            0.242      0.495     30
  Covariance adjusted           -0.003      0.477      3
  Pre-post differences           0.095      0.446     19






Branched Programmed Instruction. The branched programmed instructional
system gave rise to 5 effect sizes with a mean effect size of 0.210 and a
standard deviation of 0.798. The studies were reported in the years 1968,
1971, and 1972, with the maximum mean effect size of 1.230 being derived
in the year 1972 and the minimum mean effect size of -0.400 being obtained
in 1971. Again, we note that the mean effect size derived from journal
entries (1.230) was larger than the mean effect size derived from studies
reported in dissertations (-0.045).

Studies which made use of the random allocation of subjects to experi- 
mental and control groups produced a mean effect size of 0.143 with a 
standard deviation of 0.941 (based on 3 effect sizes), and the mean effect 
size obtained from studies which made use of either the matched allocation 
of subjects or the random allocation of intact groups of subjects was 0.310 
with a standard deviation of 0.863 (based on 2 effect sizes). All effect
sizes were obtained in the curriculum area of chemistry and were based on 
cognitive outcome measures, although 3 effect sizes were appropriate to the 
immediate evaluation of intervention effects and had a mean of 0.590 with a 
standard deviation of 0.854 and the remaining 2 effect sizes addressed the 
question of delayed effects and had a mean of -0.360 with a standard devi- 
ation of 0.085. 



Table 27
BRANCHED PROGRAMMED INSTRUCTION

Δ = 0.210
s.d. = 0.798
N = 5

                                         Standard
                                     Δ   Deviation      N

By Year of Publication
  1968                           0.310      0.863      2
  1971                          -0.400      0.028      2
  1972                           1.230      0.000      1

By Form of Reporting
  Journal article                1.230      0.000      1
  Dissertation                  -0.045      0.645      4

By Grade Level of Subjects
  11                             0.210      0.798      5

By Validity of Design
  Random                         0.143      0.941      3
  Matched & Intact Random        0.310      0.863      2

By Subject Matter
  Chemistry                      0.210      0.798      5

By Immediate or Retention
  Immediate                      0.590      0.854      3
  Retention                     -0.360      0.085      2

By Type of Outcome Criterion
  Cognitive                      0.210      0.798      5

By Method of Measurement (Instruments)
  Modified published & ad hoc    0.210      0.798      5

By Calculation of Effect Size
  From raw data                  0.210      0.798      5

By Source of Means
  Unadjusted posttest           -0.400      0.028      2
  Pre-post differences           0.617      0.809      3




Linear Programmed Instruction. Forty-seven effect sizes were obtained
in this area and had a mean of 0.170 with a standard deviation of 0.441.
The studies from which these effect sizes were drawn were reported between
1961 and 1973 and the mean effect sizes ranged in magnitude from -0.200 in
1967 to 0.806 in 1961, with no obvious trend being apparent in the data.
Effect sizes derived from studies reported in journal articles had a mean
effect size of 0.147 with a standard deviation of 0.199 (based on 6 effect
sizes) whereas effect sizes derived from studies reported in
dissertations had a mean effect size of 0.173 with a standard deviation of 0.467
(based on 41 effect sizes). The samples of subjects which were used in
the evaluation of linear programmed instructional systems were drawn from
grades 4 and 6 through 12, with the minimum mean effect size of -0.415 being
obtained in grade 8 and the maximum mean effect size of 1.070 being obtained
in grade 12; no obvious trend is apparent in the data.

Studies which made use of the random allocation of subjects to groups
generated a mean effect size of 0.180 with a standard deviation of 0.236
(based on 12 effect sizes), studies which made use of the matched
allocation or the random allocation of intact groups generated a mean effect size
of 0.178 with a standard deviation of 0.454 (based on 29 effect sizes), and
studies which utilized a nonrandom assignation process generated a mean
effect size of 0.113 with a standard deviation of 0.710 (based on 6 effect
sizes).

In all, 6 separate curriculum areas were utilized in the evaluation of
the effectiveness of linear programmed instructional systems, with the
maximum mean effect size of 0.533 occurring in physics and the minimum mean
effect size of -0.065 occurring in general science. Thirty-seven effect
sizes addressed the question of immediate evaluation of experimental and
control group effects, and had a mean effect size of 0.234 with a standard
deviation of 0.467, and 6 effect sizes addressed the question of delayed
intervention effects and had a mean effect size of -0.030 with a standard
deviation of 0.269. Forty-six out of the 47 effect sizes made use of
cognitive outcome criteria and had a mean effect size of 0.169 and a
standard deviation of 0.446. The mean effect size obtained from studies
which made use of published test materials was 0.258 with a standard
deviation of 0.394 (based on 10 effect sizes) and the mean effect size obtained
from studies which made use of either modified published test materials or
investigator-authored test materials was 0.146 with a standard deviation
of 0.455 (based on 37 effect sizes).
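
The counts in these subsections (52 programmed instruction effect sizes, of which 5 are branched and 47 linear) suggest that the programmed instruction figures pool the branched and linear ones, although the report does not say so in this passage. Under that assumption, the sketch below combines two subgroup summaries into one mean and sample standard deviation and applies it to the dissertation rows. The function name `combine` is hypothetical.

```python
import math


def combine(groups):
    """Pool (mean, sd, n) summaries into one (mean, sd, n), using sample SDs (n - 1)."""
    n_total = sum(n for _, _, n in groups)
    mean = sum(m * n for m, _, n in groups) / n_total
    # Sum of squares within each subgroup plus the spread of subgroup means around the pooled mean.
    ss = sum((n - 1) * sd ** 2 + n * (m - mean) ** 2 for m, sd, n in groups)
    return mean, math.sqrt(ss / (n_total - 1)), n_total


# Dissertation rows: linear (0.173, 0.467, 41) and branched (-0.045, 0.645, 4)
mean, sd, n = combine([(0.173, 0.467, 41), (-0.045, 0.645, 4)])
print(round(mean, 3), round(sd, 3), n)  # 0.154 0.48 45
```

The pooled result agrees with the dissertation figures reported for programmed instruction as a whole (0.154, 0.480, 45 effect sizes), which is consistent with the pooling assumption but does not prove it.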






Table 28
LINEAR PROGRAMMED INSTRUCTION

Δ = 0.170
s.d. = 0.441
N = 47

                                         Standard
                                     Δ   Deviation      N

By Year of Publication
  1961                           0.806      0.438      5
  1963                           0.190      0.651      2
  1964                           0.403      0.195      3
  1965                           0.047      0.133      3
  1966                           0.040      0.236      7
  1967                          -0.200      0.400      3
  1968                          -0.023      0.331      4
  1969                           0.265      0.021      2
  1970                          -0.046      0.495      8
  1971                           0.178      0.158      5
  1972                           0.535      0.219      2
  1973                           0.173      0.780      3

By Form of Reporting
  Journal article                0.147      0.199      6
  Dissertation                   0.173      0.467     41

By Grade Level of Subjects
  4                             -0.030      0.014      2
  6                             -0.070      0.521      7
  7                              0.023      0.342      4
  8                             -0.415      0.205      2
  9                              0.216      0.207      5
  10                             0.253      0.276     11
  11                             0.290      0.508     15
  12                             1.070      0.000      1

By Validity of Design
  Random                         0.180      0.236     12
  Matched & Intact Random        0.178      0.454     29
  Nonrandom                      0.113      0.710      6

By Subject Matter
  General Science               -0.065      0.342     10
  Life Science                   0.430      0.000      1
  Physical Science               0.148      0.161      4
  Biology                        0.055      0.424     12
  Chemistry                      0.315      0.485     17
  Physics                        0.533      0.516      3

By Immediate or Retention
  Immediate                      0.234      0.467     37
  Retention                     -0.030      0.269      6

By Type of Outcome Criterion
  Cognitive                      0.169      0.446     46
  Affective                      0.200      0.000      1

By Method of Measurement (Instruments)
  Published                      0.258      0.394     10
  Modified published & ad hoc    0.146      0.455     37

By Calculation of Effect Size
  From raw data                  0.168      0.445     38
  By direct calculation          0.373      0.420      6
  Less trustworthy              -0.207      0.140      3

By Source of Means
  Unadjusted posttest            0.288      0.480     28
  Covariance adjusted           -0.003      0.477      3
  Pre-post difference           -0.003      0.295     16




Self-Directed Study. Twenty-seven effect sizes were obtained from
studies which purported to investigate the effects of self-directed study;
these effect sizes had a mean value of 0.078 and a standard deviation of
0.375. The studies were reported in the years between 1968 and 1975, with
the minimum mean effect size of -0.310 being derived from studies in 1971
and the maximum mean effect size originating from studies reported in 1975
(0.507). The trend concerning the relative magnitudes of mean effect sizes
derived from studies reported in journals and dissertations, which has been
referred to earlier in other areas of this meta-analysis, is again evidenced
here; the mean effect size derived from studies reported in journals was
0.138 with a standard deviation of 0.544 (based on 4 effect sizes) whereas
the mean effect size derived from studies reported in dissertations was
-0.010 with a standard deviation of 0.328 (based on 19 effect sizes).
Subjects who participated in the studies summarized here were drawn from
grades 4, 5, 7, and 9 through 12, with the minimum mean effect size of
-0.185 being obtained in grade 11 and the maximum mean effect size of 0.500
being obtained in grade 10; no obvious trend is apparent in the data.
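
The journal versus dissertation comparison recurs in most subsections, so it can help to line the reported pairs up side by side. The sketch below simply tabulates the means quoted in this part of the report for the systems that have both categories; it is a bookkeeping aid rather than a significance test, and the dictionary and function names are illustrative.

```python
# (journal mean, dissertation mean) as quoted in each subsection
journal_vs_dissertation = {
    "Film based instruction": (-0.047, -0.026),
    "Programmed instruction": (0.301, 0.154),
    "Branched programmed instruction": (1.230, -0.045),
    "Linear programmed instruction": (0.147, 0.173),
    "Self-directed study": (0.138, -0.010),
    "Student assisted instructional system": (0.048, 0.170),
    "Team teaching": (0.190, 0.064),
}


def systems_where_journals_exceed(pairs):
    """Return the names of systems whose journal mean is larger than the dissertation mean."""
    return [name for name, (journal, dissertation) in pairs.items() if journal > dissertation]


for name, (journal, dissertation) in journal_vs_dissertation.items():
    print(f"{name}: journal {journal:+.3f}, dissertation {dissertation:+.3f}, "
          f"difference {journal - dissertation:+.3f}")

larger = systems_where_journals_exceed(journal_vs_dissertation)
print(f"{len(larger)} of {len(journal_vs_dissertation)} systems show a larger journal mean")
```

Note that the branched and linear rows overlap with the programmed instruction row if the latter pools them, so the rows are not independent pieces of evidence.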

Twelve of the 27 effect sizes considered here derived from studies which
made use of the random allocation of subjects to experimental and control
groups, and the mean effect size thus obtained was 0.107 with a standard
deviation of 0.436. The remaining 15 effect sizes were obtained from
studies which made use of either the matched allocation of subjects to
experimental groups or the random allocation of intact groups, and the mean effect
size in this case was 0.055 with a standard deviation of 0.334.

The curriculum areas of general science, biology, earth science,
chemistry, and physics were used as content areas in the studies appropriate to
this system. The minimum mean effect size of -0.047 was obtained in
chemistry and the maximum mean effect size of 0.200 was obtained in general
science. Twenty of the effect sizes collected here were intended to
evaluate treatment effects immediately on conclusion of the intervention, and
these effect sizes had a mean value of 0.095 with a standard deviation of
0.396. The mean effect size for delayed effects was -0.050 with a standard
deviation of 0.523 (based on 2 effect sizes). Five effect sizes were
uncoded on this variable.

In studies which made use of published test materials in order to
evaluate the outcomes of the investigation, the mean effect size was 0.088
with a standard deviation of 0.392 (based on 16 effect sizes) while studies
which made use of modified published test materials or investigator-authored
test materials produced a mean effect size of 0.065 with a standard
deviation of 0.368 (based on 11 effect sizes). The mean effect size for cognitive
outcome criteria was -0.018 with a standard deviation of 0.341 (based on 16
effect sizes) and the mean effect size for affective outcome criteria was
-0.097 with a standard deviation of 0.458 (based on 3 effect sizes). In
those cases in which it was possible to generate effect sizes directly from
reported raw data, the mean effect size thus obtained was 0.079 with a
standard deviation of 0.348 (based on 20 effect sizes); in the case of
effect sizes generated by transformation of reported statistics, the mean
effect size obtained was 0.495 with a standard deviation of 0.530 (based
on 2 effect sizes).



Table 29
SELF-DIRECTED STUDY

Δ = 0.078
s.d. = 0.375
N = 27

                                         Standard
                                     Δ   Deviation      N

By Year of Publication
  1968                           0.050      0.140      3
  1970                           0.505      0.262      2
  1971                          -0.310      0.208      4
  1972                          -0.067      0.271      7
  1973                          -0.275      0.106      2
  1974                           0.282      0.325      6
  1975                           0.507      0.376      3

By Form of Reporting
  Journal article                0.138      0.544      4
  Dissertation                  -0.010      0.328     19
  Unpublished article            0.403      0.280      3
  Conference paper               0.530      0.000      1

By Grade Level of Subjects
  4                              0.040      0.000      1
  5                              0.403      0.280      3
  7                              0.050      0.140      3
  9                              0.008      0.371      5
  10                             0.500      0.342      4
  11                            -0.185      0.208      8
  12                             0.050      0.570      3

By Validity of Design
  Random                         0.107      0.436     12
  Matched & Intact Random        0.055      0.334     15

By Subject Matter
  General Science                0.200      0.263
  Biology                        0.172      0.479
  Earth Science                  0.000      0.000
  Chemistry                     -0.047      0.355
  Physics                        0.050      0.570

By Immediate or Retention
  Immediate                      0.095      0.396     20
  Retention                     -0.050      0.523      2

By Type of Outcome Criterion
  Cognitive                     -0.018      0.341     16
  Affective                     -0.097      0.458      3
  Science Methods               -0.110      0.000
  Critical Thinking              0.170      0.000
  Creativity                     0.495      0.530
  Logical Thinking               0.403      0.280
  Self-Concept                   0.420      0.000

By Method of Measurement (Instrument)
  Published                      0.088      0.392     16
  Modified published & ad hoc    0.065      0.368     11

By Calculation of Effect Size
  From raw data                  0.079      0.348     20
  By direct calculation          0.495      0.530      2
  Less trustworthy              -0.092      0.386      5

By Source of Means
  Unadjusted posttest            0.063      0.374
  Covariance adjusted           -0.275      0.106
  Pre-post differences           0.027      0.315
  Other                          0.507      0.376






Source Papers. Studies which purported to investigate the use of source
papers as an instructional system yielded 13 effect sizes; the mean effect
size was 0.142 with a standard deviation of 0.206. The studies concerned
were reported in 1962 and 1966, with mean effect sizes of 0.136 and 0.163
respectively. All studies were reported as dissertations.

The study reported in 1962 drew its subjects from grade 10 and the
curriculum area utilized was that of life science, whereas the study
reported in 1966 drew its subjects from grade 7 and the curriculum area utilized
was general science. All effect sizes addressed the question of
intervention effects immediately on conclusion of the intervention. Where cognitive
outcome criteria were utilized, the mean effect size was 0.142 with a
standard deviation of 0.171 (based on 9 effect sizes), and where affective
outcome criteria were utilized, the mean effect size was -0.190.



Table 30
SOURCE PAPERS

Δ = 0.142
s.d. = 0.206
N = 13

                                         Standard
                                     Δ   Deviation      N

By Year of Publication
  1962                           0.136      0.199     10
  1966                           0.163      0.274      3

By Form of Reporting
  Dissertation                   0.142      0.206     13

By Grade Level of Subjects
  7                              0.163      0.274      3
  10                             0.136      0.199     10

By Validity of Design
  Random                         0.163      0.274      3
  Matched & Intact Random        0.136      0.199     10

By Subject Matter
  General Science                0.163      0.274      3
  Life Science                   0.136      0.199     10

By Immediate or Retention
  Immediate                      0.142      0.206     13

By Type of Outcome Criterion
  Cognitive                      0.142      0.171      9
  Affective                     -0.190      0.000
  Science Methods                0.253      0.253

By Method of Measurement (Instrument)
  Published                      0.183      0.220     10
  Modified published & ad hoc    0.007      0.006      3

By Calculation of Effect Size
  By direct calculation          0.163      0.274      3
  Less trustworthy               0.136      0.199     10

By Source of Means
  Pre-post differences           0.142      0.206     13






Student Assisted Instructional System. The mean value of the effect
sizes obtained in the evaluation of this system was 0.088 with a standard
deviation of 0.171, and all effect sizes were obtained in the year 1971.

The mean effect size derived from studies reported as journal articles
was 0.048 with a standard deviation of 0.205 (based on 4 effect sizes),
whereas the mean effect size derived from studies reported as dissertations
was 0.170 with a standard deviation of 0.014 (based on 2 effect sizes).
All effect sizes were derived from studies utilizing the general science
curriculum area and in all cases only the immediate evaluation of the
intervention effects was addressed.

In the case of cognitive outcome criteria, the mean effect size was
0.105 with a standard deviation of 0.332 (based on 2 effect sizes), and in
the case of affective outcome criteria, the mean effect size was 0.170 with
a standard deviation of 0.014 (based on 2 effect sizes).



Table 31
STUDENT-ASSISTED INSTRUCTIONAL SYSTEM

Δ = 0.088
s.d. = 0.171
N = 6

                                         Standard
                                     Δ   Deviation      N

By Date of Publication
  1971                           0.088      0.171      6

By Form of Reporting
  Journal article                0.048      0.205      4
  Dissertation                   0.170      0.014      2

By Grade Level of Subjects
  5                              0.048      0.205      4
  6                              0.170      0.014      2

By Validity of Design
  Random                         0.048      0.205      4
  Nonrandom                      0.170      0.014      2

By Subject Matter
  General Science                0.088      0.171      6

By Immediate or Retention
  Immediate                      0.088      0.171      6

By Type of Outcome Criterion
  Cognitive                      0.105      0.332      2
  Affective                      0.170      0.014      2
  Critical Thinking              0.020      0.000      1
  Creativity                    -0.040      0.000      1

By Method of Measurement (Instrument)
  Published                     -0.050      0.076      3
  Modified published & ad hoc    0.227      0.099      3

By Calculation of Effect Size
  By direct calculation          0.088      0.171      6

By Source of Means
  Unadjusted posttest            0.048      0.205      4
  Pre-post differences           0.170      0.014      2






Team Teaching. Forty-one effect sizes were obtained in this subsection
of the meta-analysis, producing a mean effect size of 0.058 with a standard
deviation of 0.378. Studies were reported between the years 1961 and 1980,
with a large proportion being reported in 1962 and 1963; the minimum mean
effect size of -0.365 was obtained in 1966 and the maximum mean effect size
of 0.730 was obtained in 1976. The mean effect size obtained from studies
reported in journals was 0.190 with a standard deviation of 0.357 (based on
8 effect sizes), and the mean effect size derived from studies reported in
dissertations was 0.064 with a standard deviation of 0.347 (based on 26
effect sizes), supporting the trend noted earlier. The grade level of the
subjects concerned ranged from grade 6 to grade 12, with the majority of
effect sizes occurring in grade 10. The minimum mean effect size of -0.183
was obtained in grade 12, and the maximum mean effect size of 0.165 was
obtained in grade 7. Fourteen effect sizes were derived from studies
which made use of the random allocation of subjects to experimental and
control groups, and the mean effect size in this case was -0.004 with a
standard deviation of 0.492. In the case of matched allocation to groups
or random assignment of intact groups, the mean effect size was 0.161
with a standard deviation of 0.313 (based on 19 effect sizes). In those
studies in which nonrandom allocation was utilized, the mean effect size
was -0.076 with a standard deviation of 0.238 (based on 8 effect sizes).

In all, six different curriculum areas provided the underlying content
basis for the evaluation of the team teaching system, with the minimum
mean effect size of -0.490 occurring in physical science and the maximum
mean effect size of 0.295 occurring in general science. Thirty-seven of
the effect sizes addressed the question of immediate evaluation of
intervention effects, and the mean effect size in this case was 0.063. The
mean effect size in the case of retention effects was 0.035 with a standard
deviation of 0.007 (based on 2 effect sizes). Where cognitive outcome
criteria were used, 31 effect sizes gave rise to a mean effect size of
0.087 with a standard deviation of 0.409, and where affective outcome
criteria were used, a mean effect size of -0.124 with a standard deviation
of 0.235 (based on 7 cases) was registered.

Twenty-three of the effect sizes owed their origin to the use of
published test materials, and generated a mean effect size of 0.094 with a
standard deviation of 0.394. The mean effect size derived from studies
which made use of modified published test materials or investigator-authored
test materials was 0.014, with a standard deviation of 0.361 (based on
16 effect sizes). In the case of the 17 effect sizes generated by studies
which reported raw data, the mean effect size produced was -0.101 with
a standard deviation of 0.374. In the case of the 9 effect sizes that
were produced by the transformation of reported statistics, the mean
effect size was 0.253, with a standard deviation of 0.479.



Table 32

TEAM TEACHING

                                        Mean Δ    s.d.      N

Overall                                  0.058   0.378     41

By Date of Publication
  1961                                   0.032   0.273      5
  1962                                   0.136   0.199     10
  1963                                   0.026   0.521     11
  1965                                   0.188   0.354      5
  1966                                  -0.365   0.021      2
  1968                                  -0.208   0.369      4
  1969                                   0.470   0.000      1
  1976                                   0.730   0.000      1
  1980                                   0.000   0.000      2

By Form of Reporting
  Journal article                        0.190   0.357      8
  Dissertation                           0.064   0.347     26
  Unpublished article                   -0.255   0.355      6
  Conference paper                       0.730   0.000      1

By Grade Level of Subjects
  6                                      0.000   0.000      2
  7                                      0.165   0.926      2
  9                                      0.030   0.022      4
  10                                     0.123   0.430     21
  11                                     0.013   0.204      6
  12                                    -0.183   0.320      4

By Validity of Design
  Random                                -0.004   0.492     14
  Matched & Intact Random                0.161   0.313     19
  Nonrandom                             -0.076   0.238      8

By Subject Matter
  General Science                        0.295   0.389      4
  Life Science                           0.136   0.199     10
  Physical Science                      -0.490   0.000      1
  Biology                                0.062   0.487     16
  Chemistry                             -0.027   0.286      3
  Physics                               -0.081   0.270      7

By Immediate or Retention
  Immediate                              0.063   0.398     37
  Retention                              0.035   0.007      2

By Type of Outcome Criterion
  Cognitive                              0.087   0.409     31
  Affective                             -0.124   0.235      7
  Science Methods                        0.183   0.177      3

By Method of Measurement (Instrument)
  Published                              0.094   0.394     23
  Modified published & ad hoc            0.014   0.361     16

By Calculation of Effect Size
  From raw data                         -0.101   0.374     17
  By direct calculation                  0.253   0.479      9
  Less trustworthy                       0.122   0.239     15

By Source of Means
  Unadjusted posttest                    0.083   0.507     17
  Covariance adjusted                    0.060   0.095      3
  Pre-post differences                   0.021   0.270     16
  Other                                  0.094   0.330      5

Mean Effect Sizes Broken Down by Selected Variables Across Systems 

On the following pages, the mean effect sizes for all systems, broken
down by variables thought to be of interest, are listed in order to
facilitate inter-system comparison.
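
As one illustration of the inter-system comparison these tables support, the overall mean effect sizes can be sorted from largest to smallest. The sketch below is an added example, not part of the report's method: the values are the overall means and N's transcribed from the cross-system tables that follow, `rank_systems` is a hypothetical helper, and the `min_n` cutoff of 10 effect sizes is an arbitrary assumption used only to set aside systems with very few effect sizes.

```python
# Overall mean effect size and number of effect sizes (N) for each
# top-level system, transcribed from the cross-system tables.
OVERALL = {
    "Audio-Tutorial": (0.170, 7),
    "Computer Linked": (0.134, 14),
    "Contracts": (0.467, 12),
    "Dept. Elem. School": (-0.090, 3),
    "Indiv. Instr.": (0.174, 131),
    "Mastery Learning": (0.644, 13),
    "Media Based": (-0.023, 100),
    "PSI": (0.603, 15),
    "Prog. Instr.": (0.174, 52),
    "Self-Directed": (0.078, 27),
    "Source Papers": (0.142, 13),
    "Student Assisted": (0.088, 6),
    "Team Teaching": (0.058, 41),
}


def rank_systems(overall, min_n=10):
    """Return (system, mean, n) tuples sorted by mean, largest first.

    min_n is an illustrative cutoff (an assumption, not from the report).
    """
    rows = [(name, mean, n) for name, (mean, n) in overall.items() if n >= min_n]
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_systems(OVERALL)
for name, mean, n in ranked:
    print(f"{name:<20} {mean:6.3f}  (N={n})")
```

With this cutoff, mastery learning and PSI head the list and media-based systems come last.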



Table 33

Effect Sizes by Form of Reporting for Each System

Each cell gives mean Δ/s.d. (N); -- marks a cell with no effect sizes.

System              Overall               Journal Articles      Dissertations         Unpublished Articles  Conference Papers
All Δs              0.103/0.414 (341)     0.201/0.480 (96)      0.064/0.377 (214)     -0.034/0.360 (25)     0.508/0.172 (6)
Audio-Tutorial      0.170/0.274 (7)       0.223/0.268 (3)       0.130/0.312 (4)       --                    --
Computer Linked     0.134/0.583 (14)      1.340/0.156 (2)       -0.121/0.247 (11)     --                    0.530/0.000 (1)
  CAI               0.010/0.743 (5)       1.230/0.000 (1)       -0.295/0.341 (4)      --                    --
  CMI               0.048/0.220 (8)       --                    -0.021/0.109 (7)      --                    0.530/0.000 (1)
  CSE               1.450/0.000 (1)       1.450/0.000 (1)       --                    --                    --
Contracts           0.467/0.605 (12)      0.610/0.639 (9)       0.040/0.114 (3)       --                    --
Dept. Elem. School  -0.090/0.165 (3)      0.080/0.000 (1)       -0.175/0.106 (2)      --                    --
Indiv. Instr.       0.174/0.459 (131)     0.405/0.519 (29)      0.102/0.422 (100)     --                    0.450/0.113 (2)
Mastery Learning    0.644/0.430 (13)      0.713/0.500 (9)       --                    --                    0.488/0.161 (4)
Media Based         -0.023/0.369 (100)    -0.005/0.393 (37)     -0.012/0.370 (47)     -0.097/0.320 (16)     --
  TV                0.055/0.347 (40)      0.110/0.194 (10)      0.026/0.411 (26)      0.110/0.157 (4)       --
  Film              -0.065/0.378 (58)     -0.047/0.440 (27)     -0.026/0.311 (19)     -0.166/0.335 (12)     --
PSI                 0.603/0.423 (15)      0.713/0.500 (9)       --                    0.403/0.280 (3)       0.473/0.194 (3)
Prog. Instr.        0.174/0.475 (52)      0.301/0.448 (7)       0.154/0.480 (45)      --                    --
  Branched          0.210/0.798 (5)       1.230/0.000 (1)       -0.045/0.645 (4)      --                    --
  Linear            0.170/0.441 (47)      0.147/0.199 (6)       0.173/0.467 (41)      --                    --
Self-Directed       0.078/0.375 (27)      0.138/0.544 (4)       -0.010/0.328 (19)     0.403/0.280 (3)       0.530/0.000 (1)
Source Papers       0.142/0.206 (13)      --                    0.142/0.206 (13)      --                    --
Student Assisted    0.088/0.171 (6)       0.048/0.205 (4)       0.170/0.014 (2)       --                    --
Team Teaching       0.058/0.378 (41)      0.190/0.357 (8)       0.064/0.347 (26)      -0.255/0.355 (6)      0.730/0.000 (1)



Table 34 

Effect Sizes by Grade Level of Subjects 
for Each System 





A 

s.d. 
(N) 


1 


2 


3 


4 


5 


6 


7 

7 


8 


9 


1 A 

10 


1 1 

11 


1 1 

12 


r 


All A 

All As 


0.103 
0.414 
(341) 


0.524 
' 0.289 
(5) 


-0.253 
0.280 
(3) 


.0.050 
0.479 
(7) 


-0.024 
0.151 
(10) 


0.121 
0.258 
(28) 


-0.074 
0.435 
(19) 


0.086 
0.293 
(28) 


0.315 
(25) 


0.115 

U.ZD 5 

(31) 


0.099 

/-\ i f\C. 

0 .4U5 
(63) 


0.152 
U.42U 
(76) 


0.008 

n c /. q 

(43) 


(3) 


Audio- 
Tutorial 


0.170 
0.274 
(7) 






0.000 
0.000 

(1) 


0.040 
0.000 

0) 










0.160 
0.375 
(3) 


0.335 
0.262 
(2) 








Computer 
Linked 


0.134 

r\ coo 

0.583 
(14) 










0.020 

n 1 no 
U • lUo 

(3) 










-0.053 
n Hi 

(4) 


0.143 
(3) 


0.400 

U . OH JL 

(4) 




/SAY 

CAI 


0.010 
0.743 
(5) 






















n o/.i 
u.y^fi 

(3) 


U. j jZ 

(2) 




CMI 


0.048 
0.220 
(8) 










0.0Y0 

0.108 
(3) 










-0.053 

n 117. 
0 .114 

(4) 




0.*530 

U.UUU 

(i) 




CSE 


1.450 
0.000 
(1) 
























lX50 

U ♦UUU 

(1) 




Contracts 


0.467 
0.605 
(12) 
















0.857 

U.4D/ 

(7) 


-0.255 
U.l / / 
(2) 




0.040 
n 1 1 a 

U . 11*4 

(3) 






Dept. 
El em . 
School 


-0.090 
n i 

U • 103 

(3) 










-0.175 

(2) 


0.080 
n ono 

\J * \J\J w 

(1) 
















Indiv. 
4 Instr. 


0.174 
0.459 
(131) 






o.ooo 

0.000 

(1) 


-0.007 
0.042 
(3) 


0.116 
0.302 
(7) 


-0.100 
0.461 
(8) 


0.027 
0.226 
(17) 


0.404 
0,585 
(15) 


0.192 
0.328 
(14) 


0.112 
0.440 
(20) 


0.215 
0.493 
(40) 


0.467 
0.723 
(6) 




Mastery 
Learning 


0.644 
0.430 
(13) 
















0.857 
0.467 
(7) 






0.368 
0.219 
(5) 


0.530 
0.000 
(1) 





(continued) 



Table 34, continued 





A 

s.d. 
(N) 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


11 


12 




All As 


0.103 
0.414 
(341) 


0,524 
0,289 
(5) 


-0.253 
0,280 
(?) 


0,050 
^0,479 
(7) 


-0.024 
0.151 
(10) 


0.121 
0.258 
(28) 


-0,074 
0,435 
0.9) 


0.086 
0.293 
(28} 


0,315 
0,491 
(25^ 


0.115 
0.263 

nn 


0,099 
0,406 
(«> 


0,152 
0,420 
(76) 


0.008 
0,548 
(43) 


(3) 


Media 
Based 

TV 


-0.023 
0.369 
(100) 
"57653" 
0.347 
(40) 


0.393 
0.012 

12L 

0.393 
0.012 
(3) 


-0.253 

0.280 

._J3L 
-0.253 

0.280 

(3) 


0.058 
0.524 
...J61 
0.058 
0.524 
(6) 


-0.007 
0.187 

... ISl 

-0.007 
0.187 

* (I) 


0.130 
0. 228 

.Mil. 

0.197 
0.135 
V) 


-0.190 

0.561 

...ill. 
-0.118 

0.666 
(5) 


0.180 
0.000 

...ill. 

0.180 
0.000 
(1) 


0.045 
0. 120 

...ill. 
0.045 
0.120 
(2) 


0.116 
0.160 

..mi. 

0.090 
0.143 

C6) 


0.390 
0.445 
...til- 


0.C07 
0.327 

.mil 


-0.262 
0 o 284 

..Sill. 

-0.120 
0.000 

(1) 





Film 


-0.065 
0.378 
(58^ 










0.013 
0.329 
(4) 




0.180 
0.000 
(1) 




0.146 
0.190 
(5) 


0.390 
0.445 
(6) 


0.007 
0.327 
(17) 


-0.268 
0.288 
(24) 




PSI 


0.603 
0.423 
(15) 










0.403 
0.280 
0) 






0,857 
0.467 

,.(7) 






0.368 
0.219 
(5) 






Prog. 
Instr . 


0.174 
0.475 
i§2l_ 








-0.030 
0.014 
(2) 




-0.070 
0.521 

(7) 


0.023 
0.342 


-0.415 
0.205 
(2) 


0,216 
0.207 
(5) 


0.253 
0.276 

. illl 


0.270 
0.570 
(20j_ 


lo070 
0,000 

. (1) 




v Branche 


0.210 
d 0.798 






• 
















0.210 
0.798 
til- 






Linear 


0.170 
0.441 
(47} 








-0.030 
0.014 

(2) 




-0.070 
0.521 
(7) 


0,023 
0.342 
(4) 


-0.415 
0, 205 
(2) 


0,216 
0,207 
(?) 


0.253 
0.276 
(11) 


0.290 
0.508 
(15) 


1.070 
0.000 

(1) 





Self- 
Directed 


0.078 
0.375 
(27) 








0.040 
0.000 
(1) 


0.403 
0.280 
(3) 




0.050 
0.140 
(3) 




0.008 
Oo371 
(5) 


0.500 
0.342 
(4) 


-0.185 
0.208 
(8) 


0o050 
0.570 
(3) 




Source 
Papers 


0.142 
0.206 
{13) 














0.163 
0.274 
(3) 






0.136 
0.199 
(10) 








Student 
Assi sted 


0.088 
0.171 
. (6) 










0.048 
0.205 
(4) 


0.170 
0,014 
(2) 
















Team 
Teaching 


0.058 
0.378 
(41) 












0.000 
0.000 
(2) 


0.165 
0.926 

' (2) 




0,030 
0.022 
(4) 


0.123 
0.430 
(21) 


0.013 
0.204 
(6) 


-0.183 
0.320 

(4) 





Table 35

Effect Sizes by Validity of Design for Each System

Each cell gives mean Δ/s.d. (N); -- marks a cell with no effect sizes;
? marks a value that is illegible in the source.

System              Overall               Random                Matched & Intact Random  Nonrandom
All Δs              0.103/0.414 (341)     0.105/0.454 (117)     0.169/0.405 (132)        0.007/0.355 (92)
Audio-Tutorial      0.170/0.274 (7)       0.170/0.274 (7)       --                       --
Computer Linked     0.134/0.583 (14)      0.470/1.009 (4)       0.035/0.367 (6)          -0.053/0.114 (4)
  CAI               0.010/0.743 (5)       0.143/0.941 (3)       -0.190/0.552 (2)         --
  CMI               0.048/0.220 (8)       --                    0.148/0.270 (4)          -0.053/0.114 (4)
  CSE               1.450/0.000 (1)       1.450/0.000 (1)       --                       --
Contracts           0.467/0.605 (12)      0.857/0.467 (7)       -0.078/0.201 (5)         --
Dept. Elem. School  -0.090/0.165 (3)      --                    -0.090/0.165 (3)         --
Indiv. Instr.       0.174/0.459 (131)     0.215/0.494 (56)      0.175/0.442 (53)         0.070/0.409 (22)
Mastery Learning    0.644/0.430 (13)      0.742/0.434 (10)      0.530/0.000 (1)          0.210/0.184 (2)
Media Based         -0.023/0.369 (100)    -0.219/0.443 (15)     0.071/0.266 (41)         -0.044/0.402 (44)
  TV                0.055/0.347 (40)      0.285/0.686 (2)       0.086/0.287 (34)         -0.320/0.522 (4)
  Film              -0.065/0.378 (58)     ?/0.407 (11)          ?/0.107 (7)              ?/0.385 (40)
PSI                 0.603/0.423 (15)      0.742/0.434 (10)      0.403/0.280 (3)          0.210/0.184 (2)
Prog. Instr.        0.174/0.475 (52)      0.173/0.413 (15)      0.186/0.467 (31)         0.113/0.710 (6)
  Branched          0.210/0.798 (5)       ?/0.941 (3)           ?/0.863 (2)              --
  Linear            0.170/0.441 (47)      0.180/0.236 (12)      ?/0.454 (29)             0.113/0.710 (6)
Self-Directed       0.078/0.375 (27)      0.107/0.436 (12)      0.055/0.334 (15)         --
Source Papers       0.142/0.206 (13)      0.163/0.274 (3)       0.136/0.199 (10)         --
Student Assisted    0.088/0.171 (6)       0.048/0.205 (4)       --                       0.170/0.014 (2)
Team Teaching       0.058/0.378 (41)      -0.004/0.492 (14)     0.161/0.313 (19)         -0.076/0.238 (8)



Table 36

Effect Sizes by Subject Matter for Each System

Each cell gives mean Δ/s.d. (N); -- marks a cell with no effect sizes.

System              Overall             General Science     Life Science        Physical Science    Biology             Earth Science       Chemistry           Physics             Other
All Δs              0.103/0.414 (341)   0.090/0.315 (100)   0.155/0.201 (12)    0.134/0.286 (16)    0.150/0.483 (76)    0.084/0.216 (7)     0.146/0.441 (73)    -0.014/0.508 (54)   0.093/0.330 (3)
Audio-Tutorial      0.170/0.274 (7)     0.020/0.028 (2)     --                  --                  0.230/0.311 (5)     --                  --                  --                  --
Computer Linked     0.134/0.583 (14)    0.020/0.108 (3)     --                  --                  --                  --                  0.143/0.941 (3)     0.174/0.606 (8)     --
  CAI               0.010/0.743 (5)     --                  --                  --                  --                  --                  0.143/0.941 (3)     -0.190/0.552 (2)    --
  CMI               0.048/0.220 (8)     0.020/0.108 (3)     --                  --                  --                  --                  --                  0.064/0.279 (5)     --
  CSE               1.450/0.000 (1)     --                  --                  --                  --                  --                  --                  1.450/0.000 (1)     --
Contracts           0.467/0.605 (12)    --                  --                  --                  0.610/0.639 (9)     --                  0.040/0.114 (3)     --                  --
Dept. Elem. School  -0.090/0.165 (3)    -0.090/0.165 (3)    --                  --                  --                  --                  --                  --                  --
Indiv. Instr.       0.174/0.459 (131)   0.016/0.252 (36)    0.430/0.000 (1)     0.216/0.271 (10)    0.265/0.550 (30)    0.000/0.000 (1)     0.204/0.508 (43)    0.323/0.652 (9)     0.030/0.000 (1)
Mastery Learning    0.644/0.430 (13)    --                  --                  --                  0.857/0.467 (7)     --                  0.368/0.219 (5)     0.530/0.000 (1)     --
Media Based         -0.023/0.369 (100)  0.066/0.328 (36)    --                  0.096/0.159 (5)     0.149/0.477 (15)    --                  -0.009/0.324 (15)   -0.277/0.288 (27)   0.125/0.460 (2)
  TV                0.055/0.347 (40)    0.092/0.342 (26)    --                  0.096/0.159 (5)     -0.049/0.495 (7)    --                  --                  -0.160/0.057 (2)    --
  Film              -0.065/0.378 (58)   0.090/0.244 (8)     --                  --                  0.323/0.414 (8)     --                  -0.009/0.324 (15)   -0.287/0.298 (25)   0.125/0.460 (2)
PSI                 0.603/0.423 (15)    0.403/0.280 (3)     --                  --                  0.857/0.467 (7)     --                  0.368/0.219 (5)     --                  --
Prog. Instr.        0.174/0.475 (52)    -0.065/0.342 (10)   0.430/0.000 (1)     0.148/0.161 (4)     0.055/0.424 (12)    --                  0.291/0.550 (22)    0.533/0.516 (3)     --
  Branched          0.210/0.798 (5)     --                  --                  --                  --                  --                  0.210/0.798 (5)     --                  --
  Linear            0.170/0.441 (47)    -0.065/0.342 (10)   0.430/0.000 (1)     0.148/0.161 (4)     0.055/0.424 (12)    --                  0.315/0.485 (17)    0.533/0.516 (3)     --
Self-Directed       0.078/0.375 (27)    0.200/0.263 (7)     --                  --                  0.172/0.479 (6)     0.000/0.000 (1)     -0.047/0.355 (10)   0.050/0.570 (3)     --
Source Papers       0.142/0.206 (13)    0.163/0.274 (3)     0.136/0.199 (10)    --                  --                  --                  --                  --                  --
Student Assisted    0.088/0.171 (6)     0.088/0.171 (6)     --                  --                  --                  --                  --                  --                  --
Team Teaching       0.058/0.378 (41)    0.295/0.389 (4)     0.136/0.199 (10)    -0.490/0.000 (1)    0.062/0.487 (16)    --                  -0.027/0.286 (3)    -0.081/0.270 (7)    --





Table 37

Effect Sizes by Immediate or Retention Measures for Each System

Each cell gives mean Δ/s.d. (N); -- marks a cell left blank in the source;
? marks a value that is illegible in the source.

System              Overall               Immediate             Retention
All Δs              0.103/0.414 (341)     0.126/0.430 (290)     -0.093/0.250 (33)
Audio-Tutorial      0.170/0.274 (7)       0.223/0.268 (3)       (0)
Computer Linked     0.134/0.583 (14)      0.221/0.626 (11)      -0.183/0.240 (3)
  CAI               0.010/0.743 (5)       0.118/? (4)           -0.420/0.000 (1)
  CMI               0.048/0.220 (8)       0.085/? (6)           -0.065/0.177 (2)
  CSE               1.450/0.000 (1)       1.450/0.000 (1)       (0)
Contracts           0.467/0.605 (12)      0.522/0.603 (11)      -0.130/0.000 (1)
Dept. Elem. School  -0.090/0.165 (3)      -0.090/0.165 (3)      (0)
Indiv. Instr.       0.174/0.459 (131)     0.220/0.482 (108)     -0.109/0.234 (12)
Mastery Learning    0.644/0.430 (13)      0.644/0.430 (13)      (0)
Media Based         -0.023/0.369 (100)    -0.009/0.377 (?)      -0.133/0.347 (12)
  TV                0.055/0.347 (40)      0.055/0.347 (40)      (0)
  Film              -0.065/0.378 (58)     -0.051/0.399 (43)     -0.133/0.347 (12)
PSI                 0.603/0.423 (15)      0.603/0.423 (15)      (0)
Prog. Instr.        0.174/0.475 (52)      0.260/? (?)           -0.113/0.276 (8)
  Branched          0.210/0.798 (5)       0.590/? (?)           -0.360/0.085 (2)
  Linear            0.170/0.441 (47)      0.234/? (37)          -0.030/? (6)
Self-Directed       0.078/0.375 (27)      0.095/? (20)          -0.050/0.523 (2)
Source Papers       0.142/0.206 (13)      0.142/0.206 (13)      (0)
Student Assisted    0.088/0.171 (6)       0.088/0.171 (6)       --
Team Teaching       0.058/0.378 (41)      0.063/0.398 (37)      0.035/0.007 (2)



Table 38 

Effect Sizes by Type of Outcome Criterion 
for Each System 





A 

s.d. 
(N) 


Cognitive 


Affective 


Science 

ric ci iuuo 


Psycho- 

nu cu i 


Critical 

Th inh'nn 
i ii 1 1 1 n 1 1 1 y 


Crea- 
tivity 


Self 


Logical 

Thi n k i nn 
i ii i ni\ iiiy 


All As 


0.103 
0.414 
(341) 


0.069 
0.407 
(249) 


0,034 
0,310 
(45) 


0.299 
0.415 
(3 9) 


0,892 
0.684 
(6) 


0.234 
0.311 
(7) 


0.430 
0,457 



0,317 
0.100 
(3) 


0.403 
0.280 
(3) 


Audio- 

1 u cur I u 1 


0.170 
0.274 
(7) 


0.088 
0.287 
(5) 


0,330 
0,000 
(1) 










0,420 
0,000 
(1) 




Computer 

1 inkpH 

U 1 II NCU 


0.134 
0.583 
(14) 


0.216 
0.618 


-0.167 
0,359 














CAI 


0.010 
0.743 
(5) 


0,158 
0.769 


-0.580 
0,000 

pi 














CMI 


0.048 
0.220 
(8) 


0.050 
0.260 

(6) . 


0,040 
0,028 
(2) 














CSE 


1.450 
0.000 
(1) 


1.450 1 
0.000 
(1) 
















Contracts 


0.467 
0.605 
(12) 


0.218 
0.569 
(5) 


omo 

0,449 
(3> 


1.235 
0.714 
(2) 




0.530 
0.509 
(2) 








Dept. 
El em. 
School 


-0.090 
0.165 
(3) 


-0.090 
0,165 
(3) 
















Indiv. 
Instr. 


0.174 
0.459 
(131) 


0,118 
0.440 
(102) 


0.160 
0,373 
(10) 


0.428 
0.565 

(9) 


1.165 
0.064 

(2) 


0,325 
0,405 
(4) 


0,495 
0,530 
(2) 


0.365 
0,078 
(2) 




Mastery 
Learning .. 


0.644 
0.430 
(13) 


0,498 
0.278 
(8) 


0,515 
0.446 

(2) 


1.235 
0.714 
(2) 




0,890 
0,000 

m 









'2'M. 



Table 38, continued 





A 

s.d. 
(N) 


Cognitive 


Affective 


Science 

Mpf hnds 


Psycho- 

UlU 1>U 1 


Critical 

Th i nk i no 

I ii ill rv 1 1 im 


Crea- 

f i v i t v 


Self 
Tone pnt 


Logical 
Thi nk1 na 

1 II IllrV 1 MM 


All As 


0.103 
0.414 
(341) 


0.069 
0.407 
(249) 


0.034 
0.310 
(45) 


0.299 
0.415 
(19) 


0.892 
0.684 
(6) 


0.234 
0.311 
(7) 


0.430 
0.457 

(A) 


0.317 
0.100 
(3) 


0.403 
0.280 
(3) 


Media . 
Ba N s ed 


-Q.023 
0.369 
(100) 


-0.030 
0.388 
(75) 


-0.104 
0.298 
(16) 

"•-o-.rar" 

0.000 

(1) 


0.118 
0.143 
(5) 

0.087 
(4) 


-0.080 
0.000 

CO— 


0,160 

0.014 
_ _ (2) __ 
""0.150 " 

0.000 
(1) 


0.770 

(1) 

0"."770""" 
0.000 

(1) 


---------- 


»«•«• 


TV 


6.055 
0.347 
(40) 


0.355 
1 (33) 


Film 


-0.0>5 
0.378 
(58) 


-o - .o5r~ 

0.416 
(40) 


0.309 
(15) 


"-o".loo""' 

0.000 

(1) 


--ir.w: 

0.000 

(1) 


0.000 
(l) 








PS I 


0.603 
0.423 
(15) 


0.493 
0.300 
(7) 


0.515 
0.446 

(2) 


1,235 
0.714 
(2) 




0.890 
0.000 
(1) 






0.403 
0.280 
(3) 


Prog, 

Tnc-f-p 


0.174 
0.475 
...{521. 


0,173 
0.479 
...J511. 


0.200 
0.000 

in.-. 














Branche 


0.210 
a 0.798 

L__m_ 


0,210 
0.798 

iSL. 
















Linear 


0.170 
0.441 
(47) 


0,169 
0.446 
(461 


0,200 
0.000 

m 














Sel f- 

Hi v* art oH 


0.078 
0.375 
(27) 


-0.018 
0,341 
(16) 


-0.097 
0,458 
(3) 


-0.110 
0.000 

(1) 




0.170 
0,000 
(1) 


0.495 
0.530 
(2) 


0.420 
0.000 
(1) 


0.403 
0.280 
(3) 


Source 

Da novc 

rd pcrb 


0.142 
0.206 
(13) 


0.142 
0.171 
(9) 


-0.190 
0.000 
(1) 


0.253 
0.253 
(3) 












Student 
Assisted - 


0.088 
0.171 
(6) 


0.1O5 
0.332 


0.170 
- 0.014 

d\ 
UJ 






0.020 
0.000 
,\±1 . 


-0,040 
0.000 






Team 
Teaching 


0.058 
0.378 

(41), 


0,087 
0.40° 
(31) 


-0,124 
0.235 
(7) 


0.183 
0.177 
(3) 













Table 39

Effect Sizes by Method of Measurement for Each System

Each cell gives mean Δ/s.d. (N); -- marks a cell with no effect sizes;
? marks a value that is illegible in the source.

System              Overall               Published             Modified Publ. & Ad hoc  Other Assessment
All Δs              0.103/0.414 (341)     0.045/0.387 (173)     0.126/0.393 (158)        0.951/0.486 (8)
Audio-Tutorial      0.170/0.274 (7)       0.375/0.064 (2)       0.088/0.287 (5)          --
Computer Linked     0.134/0.583 (14)      -0.158/0.256 (5)      0.297/0.661 (9)          --
  CAI               0.010/0.743 (5)       -0.580/0.000 (1)      0.158/0.769 (4)          --
  CMI               0.048/0.220 (8)       -0.053/0.114 (4)      0.148/0.270 (4)          --
  CSE               1.450/0.000 (1)       --                    1.450/0.000 (1)          --
Contracts           0.467/0.605 (12)      0.467/0.605 (12)      --                       --
Dept. Elem. School  -0.090/0.165 (3)      -0.090/0.165 (3)      --                       --
Indiv. Instr.       0.174/0.459 (131)     0.159/0.442 (65)      0.159/0.453 (64)         1.165/0.064 (2)
Mastery Learning    0.644/0.430 (13)      0.713/0.500 (9)       0.488/0.161 (4)          --
Media Based         -0.023/0.369 (100)    -0.081/0.351 (51)     0.038/0.381 (49)         --
  TV                0.055/0.347 (40)      0.020/0.119 (7)       0.063/0.379 (33)         --
  Film              -0.065/0.378 (58)     ?/0.377 (42)          -0.014/0.390 (16)        --
PSI                 0.603/0.423 (15)      0.713/0.500 (9)       0.438/0.219 (6)          --
Prog. Instr.        0.174/0.475 (52)      0.258/0.394 (10)      0.154/0.494 (42)         --
  Branched          0.210/0.798 (5)       --                    0.210/0.798 (5)          --
  Linear            0.170/0.441 (47)      0.258/0.394 (10)      0.146/0.455 (37)         --
Self-Directed       0.078/0.375 (27)      0.080/0.392 (16)      0.065/0.368 (11)         --
Source Papers       0.142/0.206 (13)      0.183/0.220 (10)      0.007/0.006 (3)          --
Student Assisted    0.088/0.171 (6)       -0.050/0.076 (3)      0.227/0.099 (3)          --
Team Teaching       0.058/0.378 (41)      0.094/0.394 (23)      0.014/0.361 (16)         --





Table 40

Effect Sizes by Calculation of Effect Size for Each System

Each cell gives mean Δ/s.d. (N); -- marks a cell with no effect sizes.

System              Overall               From raw data         By direct calculation  Less trustworthy methods
All Δs              0.103/0.414 (341)     0.099/0.435 (179)     0.144/0.422 (117)      0.013/0.275 (45)
Audio-Tutorial      0.170/0.274 (7)       0.130/0.312 (4)       0.335/0.262 (2)        0.000/0.000 (1)
Computer Linked     0.134/0.583 (14)      0.149/0.640 (11)      -0.145/0.064 (2)       0.530/0.000 (1)
  CAI               0.010/0.743 (5)       0.010/0.743 (5)       --                     --
  CMI               0.048/0.220 (8)       0.028/0.079 (5)       -0.145/0.064 (2)       0.530/0.000 (1)
  CSE               1.450/0.000 (1)       1.450/0.000 (1)       --                     --
Contracts           0.467/0.605 (12)      -0.078/0.201 (5)      0.857/0.467 (7)        --
Dept. Elem. School  -0.090/0.165 (3)      0.080/0.000 (1)       -0.250/0.000 (1)       -0.100/0.000 (1)
Indiv. Instr.       0.174/0.459 (131)     0.176/0.476 (72)      0.236/0.469 (45)       -0.032/0.258 (14)
Mastery Learning    0.644/0.430 (13)      0.473/0.194 (3)       0.857/0.467 (7)        0.317/0.226 (3)
Media Based         -0.023/0.369 (100)    -0.080/0.345 (42)     0.055/0.413 (44)       -0.097/0.244 (14)
  TV                0.055/0.347 (40)      0.018/0.428 (12)      0.066/0.341 (23)       0.096/0.159 (5)
  Film              -0.065/0.378 (58)     -0.101/0.308 (28)     0.044/0.488 (21)       -0.204/0.219 (9)
PSI                 0.603/0.423 (15)      0.438/0.219 (6)       0.857/0.467 (7)        0.210/0.184 (2)
Prog. Instr.        0.174/0.475 (52)      0.173/0.485 (43)      0.373/0.420 (6)        -0.207/0.140 (3)
  Branched          0.210/0.798 (5)       0.210/0.798 (5)       --                     --
  Linear            0.170/0.441 (47)      0.168/0.445 (38)      0.373/0.420 (6)        -0.207/0.140 (3)
Self-Directed       0.078/0.375 (27)      0.079/0.348 (20)      0.495/0.530 (2)        -0.092/0.386 (5)
Source Papers       0.142/0.206 (13)      --                    0.163/0.274 (3)        0.136/0.199 (10)
Student Assisted    0.088/0.171 (6)       --                    0.088/0.171 (6)        --
Team Teaching       0.058/0.378 (41)      -0.101/0.374 (17)     0.253/0.479 (9)        0.122/0.239 (15)



Table 41

Effect Sizes by Source of Means for Each System

Each cell gives mean Δ/s.d. (N); -- marks a cell with no effect sizes;
? marks a value that is illegible in the source.

System              Overall               Unadjusted Posttest   Covariance Adjusted   Pre-post Differences  Other
All Δs              0.103/0.414 (341)     0.125/0.448 (162)     0.086/0.387 (67)      0.087/0.382 (93)      0.024/0.358 (18)
Audio-Tutorial      0.170/0.274 (7)       0.040/0.000 (1)       --                    0.230/0.311 (5)       0.000/0.000 (1)
Computer Linked     0.134/0.583 (14)      -0.174/0.269 (8)      --                    0.548/0.731 (5)       0.530/0.000 (1)
  CAI               0.010/0.743 (5)       -0.295/0.341 (4)      --                    1.230/0.000 (1)       --
  CMI               0.048/0.220 (8)       -0.053/0.114 (4)      --                    0.020/0.108 (3)       0.530/0.000 (1)
  CSE               1.450/0.000 (1)       --                    --                    1.450/0.000 (1)       --
Contracts           0.467/0.605 (12)      -0.078/0.201 (5)      0.857/0.467 (7)       --                    --
Dept. Elem. School  -0.090/0.165 (3)      --                    -0.010/0.127 (2)      -0.250/0.000 (1)      --
Indiv. Instr.       0.174/0.459 (131)     0.176/0.514 (51)      0.198/0.467 (32)      0.150/0.412 (40)      0.190/0.330 (8)
Mastery Learning    0.644/0.430 (13)      0.368/0.219 (5)       0.857/0.467 (7)       --                    0.530/0.000 (1)
Media Based         -0.023/0.369 (100)    -0.042/0.393 (40)     -0.048/0.279 (26)     0.071/0.416 (28)      -0.225/0.250 (6)
  TV                0.055/0.347 (40)      0.018/0.428 (12)      0.144/0.171 (9)       0.041/0.372 (18)      -0.040/0.000 (1)
  Film              -0.065/0.378 (58)     -0.044/0.386 (26)     -0.149/0.274 (17)     0.125/0.503 (10)      -0.262/0.260 (5)
PSI                 0.603/0.423 (15)      0.381/0.224 (8)       0.857/0.467 (7)       --                    --
Prog. Instr.        0.174/0.475 (52)      0.242/0.495 (30)      -0.003/0.477 (3)      0.095/0.446 (19)      --
  Branched          0.210/0.798 (5)       ?/0.028 (2)           --                    0.617/0.809 (3)       --
  Linear            0.170/0.441 (47)      0.288/0.480 (28)      -0.003/0.477 (3)      -0.003/0.295 (16)     --
Self-Directed       0.078/0.375 (27)      0.063/0.374 (15)      -0.275/0.106 (2)      0.027/0.315 (7)       0.507/0.376 (3)
Source Papers       0.142/0.206 (13)      --                    --                    0.142/0.206 (13)      --
Student Assisted    0.088/0.171 (6)       0.048/0.205 (4)       --                    0.170/0.014 (2)       --
Team Teaching       0.058/0.378 (41)      0.083/0.507 (17)      0.060/0.095 (3)       0.021/0.270 (16)      0.094/0.330 (5)



CONCLUSIONS 

Although it must be done with caution, it is possible to draw some
broad generalizations from the integration of research studies on science
instructional systems. The most successful innovative systems appear to
be mastery learning (Δ = .64 overall and Δ = .50 for cognitive achievement)
and P.S.I. (Δ = .60 overall and Δ = .49 for cognitive achievement). Specific
data on the various other outcome variables displayed in Table 6 verify that,
in addition to being approximately one-half standard deviation better than
control groups on cognitive measures, these two systems look good on other
variables as well. On the other hand, media-based systems in general appear
to perform at a lower level than the traditional instruction used as the control
group treatment. Most of the remaining systems operate at a level very little
higher than the conventional instruction they have replaced; most have an
average effect size approximating 0.1 standard deviations both on outcome measures
overall and on cognitive measures. When compared with conventional instruction,
instructional systems do not show a striking advantage (Δ = .10), and the impact
in terms of affective measures is practically nothing (Δ = .04).
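
One common way to read an effect size is as the percentile of the control-group distribution reached by the average student in the treated group. The sketch below is an added illustration and assumes normally distributed scores, an assumption this section does not state; `control_percentile` is a hypothetical helper name.

```python
import math


def control_percentile(effect_size):
    """Percentile of the control distribution that lies `effect_size`
    standard deviations above its mean, assuming normal scores.

    Uses the standard normal CDF written in terms of math.erf.
    """
    return 50.0 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))


# Effect sizes quoted in the conclusions above.
for label, delta in [("mastery learning", 0.64),
                     ("P.S.I.", 0.60),
                     ("all systems", 0.10)]:
    print(f"{label:<17} effect size {delta:.2f} -> "
          f"percentile {control_percentile(delta):.1f}")
```

Under this assumption, an effect size of one-half standard deviation places the average treated student at about the 69th percentile of the control group, while an effect size of 0.10 moves that student only a few points above the 50th.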

Making additional broad generalizations is difficult because of the small
number of effect sizes found for many outcome variables. In addition, the
number of different instructional systems for which there are data on a given
outcome variable is generally very small. As a result, it is difficult to
make generalizations about instructional systems broadly, since the data
provided in this meta-analysis are limited to only a few instructional
systems for a particular outcome variable. In the case of three variables,
however, some generalizations may be possible.

For science methods, critical thinking, and logical thinking, the number
of effect sizes is large enough, and the teaching systems evaluated with
respect to each outcome variable are diverse enough, that one can say
something about instructional systems in general. A review of Table 10b
indicates there is an average effect size in favor of the instructional
system of approximately .40 for these three outcome variables.

The most important conclusions of this meta-analysis, however, do not pertain
to instructional systems overall but to particular systems. The data in Table 10b
are instructive in this regard. A potential recording of data for an outcome
variable under more than one instructional system in Table 10b is not a concern,
because the data (whether used for another instructional system also or not) are
indicative of the impact of the particular instructional system under consideration.

A related point is that meaningful interpretation of the results of this meta- 
analysis with respect to a given instructional system requires careful analysis 
and examination of that system. One must know what it is about each system that 
makes it work, and in particular what it is that the most successful systems 
have in common. Such a review requires that one look at the characteristics 
of the various systems and determine what makes each one successful enough to 
stand out. An example of such an endeavor is meta-analysis work done in higher 
education (Kulik & Kulik, 1979), which identified P.S.I. and some other instructional
systems as being useful on the college level. In their examination of these 
instructional systems, they stated that a key characteristic held in common 
by these successful approaches was frequent testing with immediate feedback. 
While it is pleasing to see commonality between the results of the meta-analysis 
reported here and the work of another researcher at the college level, the 
key point to be made here is that the interpreter of these results must look 
beyond simple labels or even rather extended definitions as reported in this 
paper and analyze carefully what the components of each instructional system 
are. Such careful analysis work may make it possible to identify the key 
facets of instructional systems which are essential for their success. The 
results of such interpretive work are of value to practitioners in the field 
and to researchers needing to identify the elements of instruction with the 
greatest potential for increasing learning.



SECTION IV. STUDIES INCLUDED IN THIS REPORT 






NUMERICAL LIST OF CODED STUDIES 
(For more details, see alphabetical list by authors) 



Number Author 

2001 Anderson, C. J. 

2002 Williams, W. W. 

2003 Charles, E. 

2004 Young, P. A. 
2005 Koenig, H. G.

2006 Grooms, H. H.

2007 Wachs, S. R. 



Source* Measure(s) Used 



2008 Ward, P. E. 

2009 Williams, H. R.



2010 Krockover, G. H. 



4 

4 

4 
4 
4 

4 
4 



4 
4 



CHEM Study Chemistry, 
Chapters 1-7 

Comprehensive Test in Basic 
Physical Science 

Ad hoc** Cloze tests 

Ad hoc Biology achievement 



Modified Minnesota High 
School Achievement Exam 

Metropolitan Achievement Test 

Test on Understanding Science 
(TOUS), Form Jx 

Ad hoc physics exam 

Ad hoc biology exam 

STEP Science, Form A 

ACS-NSTA Cooperative Exam, 
High School Chemistry, 
Form 1961 

TOUS 

Thurstone Interest Schedule 

Purdue Master Attitude Scale 

ACS Coop. Exam, General
Chemistry, Form 1963

TOUS, Form 2 

Watson-Glaser Test of Criti- 
cal Thinking 

Prouse Subject Preference Survey

*1=journal; 2=book; 3=master's thesis; 4=doctoral dissertation; 5=unpublished;
6=paper presented at a conference.

**"Ad hoc" indicates instruments created by the investigator for the study.

2011 Scarpino, F. L.



2012 Moore, B. F. 

2013 Fryar, W. R. 

2014 Eshleman, W. H. 

2015 Marshall, G. 

2016 Darnowski, V. S. 

2017 McKee, R. J. 

2018 Joslin, P. H. 

2019 Dasenbrock, D. H.

2020 Molotsky, L. L.

2021 Inventash, H. 



2022 Humphreys, D. W. 

2023 Stedman, C. H. 

4 Co-Op Chemistry Test,
Forms A, B. 

Anderson-Fisk Chemistry Test, 
Forms E, F, 

DuBelle Student Preference 
Report, Forms A, B, 

Ad hoc lab skills test 

4 Brown-Holtzman Survey of 
Study Habits & Attitudes 

BSCS Comprehensive Final Exam 
4 Ad hoc, "Aquatic Life" 
4 Ad hoc cognitive test 
4 Ad hoc cognitive test 

School midterm exams 

School final exam 
4 Ad hoc cognitive test 
4 Ad hoc cognitive test 

Ad hoc application test 
4 N.Y. State Regent's Exam 

Ad hoc cognitive test 

4 Ad hoc cognitive test 

4 Ad hoc Biol. Achievement Test 

4 Co-Op Science Test 

STEP Achievement Tests,
Forms 3B, 3A.

TOUS

4 BSCS Comprehensive Exam

Ad hoc Q-sort
4 Ad hoc general achievement test



2024 Turpin, G. R.

2025 Thornton, W. T.

2026 Waine, S. I.

2027 Metller, R. D.

2028 Summerlin, L. R.

2029 Wiegand, C. H.

2030 James, R. K.

2031 Slattery, J. B.

2032 Braly, J. L.

2033 Reed, L. H.



4 Ad hoc semantic differential
scale 

4 Ad hoc achievement test 

4 Chemistry I : Atomic 
Structure and Bonding 

4 Nelson Biology Test 

Purdue Student Attitude Test 
Dunning Physics Test 

4 Ad hoc achievement test 

4 Every Pupil Achievement 
Test, Elementary Science, 
Grades V-VIII.

4 Ad hoc Seventh Grade 
Matter Final Exam 

Metropolitan Achievement 
Test-- Adv. Science Test 

Anderson-Fisk Chemistry Test, 
Form E. 

Read General Science Test 

Facts About Science Test 

Watson-Glaser Critical 
Thinking Appraisal 

TOUS 

4 New York Regents Exam, Biology 
New York Regents Exam, Chem. 
New York Regents Exam, Physics 

4 ACS-NSTA Chemistry Exam 

4 Stanford Achievement Test:
Science.

Remmer's Attitude Toward Any
School Subject Scale

Piers-Harris Children's Self-
Concept Scale

4 

2034 Koch, D. P. 

2035 Taffel, A.

2036 Call, R. L. 

2037 Blank, S. S. 

2038 Mottillo, J. L.

2039 Denton, J. J. 

2040 Payne, C. R. 

2041 White, R. W.

2042 Heffernan, D. F.

2043 Carnes, P. E.

2044 O'Toole, R. J. 

2045 Connor, J. L. 

2046 Breedlove, C. B. 

2047 Hunt, E. G. 



4 Project Physics Achievement 
Test, Units 2, 3. 

Ad hoc confidence scores

4 Ad Hoc Midyear Achievement Test

New York Regents Exam, Physics

Dunning Physics Test

4 ACS-NSTA Cooperative Exam,
High School Chemistry

Anderson-Fisk Chemistry 
Test, Form F 

4 Ad hoc cognitive test 

4 Ad hoc unit exams 

4 Ad hoc Physics Achievement Test 

Purdue Master Attitude Scale 
for Measuring Attitude Toward 
Any School Subject, Form B 

4 Ad hoc chapter tests 

4 Cooperative Biology 

4 TOUS

Watson-Glaser Critical 
Thinking Appraisal 

4 Ad hoc cognitive tests 

4 Ad hoc problem solving test 

Ad hoc cognitive tests 

4 Ad hoc achievement test 

4 TOUS

Allen Attitude Inventory 

Ad hoc achievement test 

4 California Survey Test in 
Physical Science 

2048 Alcorta, L. fi. 4

2049 Aaron, G. 4 

2050 Love, G. H. 4 

2051 Brown, F. K. , & D. P. Butts 1 

2052 Raghubir, K. P. 1 

2053 Long, J. C, J. R. Okey, & 1 
R. H. Yeany 

2054 Martin, W. J., & P. E. Bell 1 

2055 Study deleted

2056 Toohey, J. V. 1

2057 Welliver, P. W. 1

2058 Gallagher, J. J. 1



Iowa Test of Educational 
Development #6, #2 

Brown-Holtzman Survey of
Study Habits

STEP, Forms 3A, 3B, 2A, 2B 

Nelson Biology Test 

Anderson Chemistry Test 

Ad hoc achievement test 

Facts About Science Test, 
A and B 

Ad hoc cognitive test 

Stanford Science Achievement 
Test, Form X 

Ad hoc cognitive test 

Ad hoc achievement test 

Ad hoc achievement test 

Ad hoc evaluations, lab 
skills and affective 

Ad hoc achievement test 

Ad hoc Physical Science 
Achievement Test 

TOUS 

Ad hoc Science Current 
Events Test 

STEP Science, Form 3A 

Thurstone Interest Scale 

Coded videotapes

Ad hoc test of interaction
recognition



2059 Anderson, C., & D. Butts

Ad hoc achievement test

2060 Anderson, R. D., & A. R. Thompson

2061 Netburn, A. N.

2062 Cowan, P. J.

2063 Siddiqi, M. N.

2064 Galey, M.

2065 Tucker, J. L.

2066 Fulton, H. F.

4
4
4

2067 Wash, J. A. 4

2068 Pella, M. O., & C. Poulos 1



Ad hoc Attitude to Science 
questionnaire 

Ad hoc achievement test 

Stanford Achievement Test 

Boulder Test of Creative 
Thinking 

Boulder Test of Critical 
Thinking 

Ad hoc cognitive test 

Time spent on each lesson 

PSSC tests, 1-5 

PSSC Tests, 1-5 

Ad hoc achievement test, 
performance interview 

Ad hoc Picture Test for 
Science Processes 

Ad hoc Science Concepts Test 

BSCS Final Exam, 1964 

Nelson Biology Test, Form E 

TOUS, Form W 

F.A.S. 

Watson-Glaser Critical 
Thinking Appraisal, Form Zm 

Silance Attitude Scale,
Form A 

Prouse Subject Preference 
Survey 

Ad hoc achievement test 
Ad hoc Biology Exam 
Coop. Biology Exam 






2069 Sutman, F. X., & M. Yost

2070 Hedges, W. D., & M. A. MacDougall 1

2071 Garside, L. J.

2072 Anderson, K. E., F. S. Montgomery, & R. W. Ridgway

2073 Anderson, K. E., F. S. Montgomery, H. A. Smith, & D. S. Anderson

2074 Anderson, K. E., & F. S.

2075 Jacobs, H. N., & J. K. Bollenbacher

2076 Jacobs, L. C.

2077 Strehle, J. A.

2078 Walker, M. A.

2079 Przekop, L. R.

2080 Popham, W. J., & J. M. Sadnavitch

Also included in #2080:
Sadnavitch, Popham, & Black



1 Ad hoc unit tests 

1 STEP Science Achievement, 
Forms A and B 

California Interest
Inventory 

4 Ohio Physics Test 

Wisconsin Physics Final Test 

Physics Accumulated Test 

1 Minnesota State Board 
Exam in Biology, 1947 

1 Nelson Biology Test, Forms 
Am, Bm



1 Dunning Physics Test, 
Forms Am and Bm 

1 Coop. Biology Test, 
Forms X, Y 

Test of Knowledge About 
Science and Scientists 

4 Ad hoc achievement tests 

4 Ad hoc achievement test 

4 Otis-Lennon Mental Ability 
Test, Elem. II Level, Form K

4 Ad hoc achievement test 

1 Coop. Physics Test, 1950 
Coop. Chemistry Test 
Thurstone Interest Schedule, 
1947 

Scale for Measuring Attitude 
Toward Any School Subject 

1 Coop. Physics Test 

Coop. Chemistry Test



2081 Pella, M. O., J. Stanley, C. A. Wedemeyer, & W. A. Wittich

2082 Allison, R. W.

2083 Dilorenzo, L. T., & J. W. Halliwell

2084 Crabtree, J. F.

2085 Hug, W. E.

2086 O'Brien, S. J.

2087 Troost, C. J., & S. Morris 1

2088 Beets, M. M. 4

2089 Beisenberz, P. C. 4

2090 Champa, V. A.

2091 Grassell, E. M.

2092 Boblick, J. M.

2093 Boblick, J. M.

2094 Wickline, L. E.



1 Wisconsin Physics Test

Ohio Physics Schol. Test 

Ad hoc affective 

4 Adaptation of Allen Inventory 
of Attitudes Toward Science 
and Scientific Careers 

1 Metropolitan Achievement 
Test 

1 Ad hoc achievement test 

Time on task 

1 Comprehensive Final Exam* 
in First Year Biology 

4 Childhood Attitude Toward 
Problem Solving 

Ad hoc cognitive measure 

Creative Thinking Test 
(adapted for elementary)

4 Picture Test for Science
Processes, Grades 1 & 2; 3 & 4
(local)

Science Concepts Test,
Grades 3 & 4 (local)

4 Coop. Science Test 

4 STEP Physics Test 

Dunning Physics Test 

Ad hoc achievement tests 
1 Ad hoc achievement test 

Time on task 
1 Ad hoc achievement test 
4 Allen Attitude Scale 

Facts About Science Test 






2095 Nordland, F. H., J. B. Kahle, S. Randak, & T. Watts 1

2096 Penick, J. E., D. Scfilitt, S. Bfender, & O. Lewis 1

2097 Kahle, J. B., F. H. Nordland, & C. B. Douglass 1

2098 Penn, R. F. 4

2099 Johnson, L.

2100 May, J.

1

2101 Richard, P. W.

2102 Kline, A. A.

2103 Fiel, R. L., & J. R. Okey

2104 Zeschke, R.

2105 Crocker, R. K., et al.

2106 Black, W. A., et al.

2107 Yarber, W. L.

2108 Patterson, M. D.

2109 Monaco, W. J., & M. Szabo



4 
1 



1 
1 
1 
1 

5 
5 



Ad hoc unit tests 



Torrance Test of Creative
Thinking (Figural
Creativity; Verbal
Creativity)

Ad hoc achievement test



ACS/CHEM test 
TOUS 

Cornell Critical Thinking 
Test, Form Z (1961)

Ad hoc achievement test

ACS-NSTA Test, Form 1971

ACS-NSTA Test, Form 1970 
advanced 

1 BSCS Achievement Test 

Ad hoc achievement test 

Ad hoc achievement test 

Ad hoc achievement test 

Time on task 

Ad hoc achievement tests 

Cooperative Chemistry Test 

Cooperative Physics Test

A Venereal Disease Knowledge 
Inventory 

Bristol Study Skills 
(abbreviated) 

Stanford Achievement Test, 
Science Sub-score



2110 Linn, M. C., B. Chen, & H. D. Thier

2111 Fritz, o, 0,

2112 Gar^y, R. J., H. J. Dietmeyer, M. Kraft, & A. C. Sheehan

2113 Winter, S. S., S. D. Farr, J. J. Montean, & J. A. Schmidt 5

2114 Vandermeer, A. W.

2115 Robinson, D. B.

2116 Denton, J. J., & F. J. Gies 6

2117 Swanson, O. H. 6

2118 Strevell, W. H. 5

2119 Nelson, C. H. 1

2120 Wade, S. E. 1

2121 McCollum, T. E. 1



Science Process Test — 
Variables; Experimentation 

Interviews 

Ad hoc achievement test 

Adapted Allen Attitude Scale 
Toward Science and Scientific
Careers

Ad hoc science information
test

Science Reasoning Test
(local)

STEP Science Test

New York Regents Exam,
Chemistry

Science Reasoning Test

Kuder Preference Inventory 

Calvert Science Information 
Test, Intermediate Form B 

Ad hoc unit tests 

New York Regents Exam, 
Biology, 1967 

Nelson Biology Test 

Number of objectives achieved 

Adaptation of previous 
Regents Exams, Chemistry

Number of objectives mastered 

Dunning Physics Achievement 
Test 

Ad hoc achievement test 
Ad hoc achievement tests 
Achievement tests 




2122 Noall, M. F., & L. Winget

2123 Glass, L. W., & R. E. Yager 1

2124 Simmons, J. B., W. J. Davis, G. C. Ramseyer, & J. J. Johnson

2125 Anderson, K. E., F. S. Montgomery, & S. F. Moore

2126 Jerkins, K. F.

2127 Lee, J. E.

2128 Shinfeld, S. L.

2129 Moore, W. J.

2130 Martinez-Perez, L.

2131 Hughes, W. R.



I 

Coop. Physics Test, 
Forms X, Y 

Strong Vocational Interest 
Blank (Attitude to Science) 

TOUS
FAS 

BSCS Standardized Biology 
Achievement Test 



Anderson Chemistry Test, 
Form Am 

ACS-NSTA Chemistry Exam,
Form 1959

Ad hoc lab skills test
TOUS

Coop. Science Test
MP AT I

Ad hoc biology achievement 
test 

Attitude Toward Subject 

Self Concept of Ability
in Science

ACS Chemistry Test

Ad hoc achievement tests

Silance: Attitude Toward
Any School Subject

Processes of Science Test

Final Comprehensive Exam
(BSCS Patterns & Processes)

Piers-Harris Children's 
Self-Concept Scale 

Attitude Toward Science 
Inventory (modified) 

Ad hoc Processes of Science 
Test; Content Examinations. 




REFERENCES 



Block, J. H. Mastery learning theory and practice. New York: Holt,
Rinehart & Winston, 1971.

Carmichael, J. W., Jr. General chemistry by PSI at a minority institution.
Journal of Chemical Education, 1976, 53(12):791-2.

Emerson, David W. Teaching organic chemistry by a modified Keller plan.
Journal of Chemical Education, 1975, 52(4):228-9.

Good, Carter V. (Ed.). Dictionary of education. New York: McGraw-Hill,
1973.

Hartley, J. R. Computer assisted learning in the sciences: some progress
and some prospects. Studies in Science Education, 3(1976):69-96.

Herring, J. Dudley, Harold H. Jaus, Van Neie, Thom Luce, & Terry O'Heron.
A summary of research in science teaching, 1974. Purdue University.
(ERIC Document Reproduction Service No. ED 116 ?55)

Jackson, Gregg B. Methods for reviewing and integrating research in the
social sciences. The George Washington University, Social Research
Group, 1978.

Kulik, James A., and Kulik, Chen-Lin C. "College Teaching" in Peterson, P. L.,
and Walberg, H. J. (Eds.), Research on Teaching. Berkeley, California:
McCutchan Publishers, 1979.

Kuska, Henry A. Use of the Keller plan in a freshman chemistry trailer
course. Journal of Chemical Education, 1976, 53(8):505.

Marchese, Richard W. A literature search and review of the comparison of
individualized and conventional modes of instruction in science.
School Science and Mathematics, 1977, 77(8):699-703.

Melton, Raymond G. Summary of the findings of three NSF sponsored studies
on the status of precollege science education. In National Science
Foundation, What are the needs in precollege science, mathematics,
and social science education? Views from the field. Washington:
Government Printing Office, 1980.

Nordland, Floyd, Jane Kahle, & Hugh Via. A practical approach to an audio-
tutorial system. School Science and Mathematics, 1972, 72(8):673-678.

Novak, Joseph D. A summary of research in science education for 1972.
Ithaca, NY: Cornell University, 1974. (ERIC Document Reproduction
Service No. ED 090 055)

Palladino, George F. General chemistry: an alternative to PSI for advanced
students. Journal of Chemical Education, 1979, 56(5):323-324.

Postlethwaite, S. N., J. Novak, & H. Murray. An integrated experience
approach to learning. Purdue University, Lafayette, Indiana.
Minneapolis: Burgess, 1964. (ERIC Document Reproduction Service
No. ED 010 765)

Ramsey, Gregor A., & Robert W. Howe. An analysis of research on instructional
procedures in secondary school science, part II, instructional
procedures. Science Teacher, 1969, 36(4):72-91.

Schramm, Wilbur. Programed instruction today and tomorrow. Reprinted in
Four Case Studies of Programed Instruction. New York: Fund for the
Advancement of Education, 1964.

Silberman, Harry F. Characteristics of some recent studies of instructional
methods. In John E. Coulson (Ed.), Programmed Learning and
Computer-Based Instruction. New York: John Wiley and Sons, 1962.

Smith, Homer A., Jr. The evolution of a self-paced organic chemistry
course. Journal of Chemical Education, 1976, 53(8):510-512.

Weiss, Iris R. 1977 national survey of science, mathematics and social
science education highlights report. In The status of pre-college
science, mathematics, and social studies educational practices in
U.S. schools: An overview and summaries of three studies. Washington:
National Science Foundation, 1978.

THE EFFECTS OF VARIOUS SCIENCE TEACHING STRATEGIES ON ACHIEVEMENT



Kevin C. Wise 
James R. Okey 

University of Georgia 
Athens, GA 30602 






TABLE OF CONTENTS 



Purpose of Study

Definition of Teaching Techniques

    Audio-visual
    Focusing
    Grading
    Inquiry-Discovery
    Manipulative
    Modified
    Presentation Mode
    Questioning
    Teacher Direction
    Testing
    Wait-Time
    Miscellaneous

Procedures

    Selection of Studies
    Data Sources
    Coding the Studies
    The Coding Process

Results

    Descriptive Data
    Effect Size Data

Interpretation and Implications

References



LIST OF TABLES 



TABLE

1 Mean Effect Sizes Obtained for Cognitive and Other
  Outcomes Using Different Teaching Strategies

2 Mean Effect Sizes Obtained With Students at
  Different Grade Levels

3 Mean Effect Sizes Obtained in Classes of Different
  Sizes

4 Mean Effect Sizes Obtained in Studies Focusing on
  Different Academic Areas in Science

5 Mean Effect Sizes for Various Types of Literature
  Reports

6 Mean Effect Sizes Obtained in Studies With Different
  Number of Students Involved

7 Mean Effect Sizes for Studies Involving Different
  Numbers of Teachers

8 Mean Effect Sizes for Studies Conducted for Different
  Amounts of Time



THE EFFECTS OF VARIOUS SCIENCE 
TEACHING STRATEGIES ON ACHIEVEMENT 



Kevin C. Wise and James R. Okey 
University of Georgia 
Athens, GA 30602 



Purpose of the Study 

The purpose of this study was to synthesize findings of the effects
on science achievement of various teaching strategies using the procedures
of meta-analysis (Glass, 1976, 1978). This is one of seven areas of science
education research selected for study in the University of Colorado Science
Meta-Analysis Project. The seven areas had been selected by the Colorado
project, in consultation with a national panel, as representing significant
blocks of findings of sufficient importance and scope to justify an
integration of the literature.

Numerous studies dealing with the effects of various science teaching
techniques exist in a variety of documents. The integration and
interpretation of this aggregate research on the topic of science teaching
techniques cannot be handled sufficiently through narrative means alone.
Chronologically ordered verbal descriptions may be sufficient when only a
few studies are related to a topic. When tens or hundreds of studies are
involved, however, a narrative approach fails to accommodate the accumulated
knowledge.

If the findings of many studies are regarded as data points and
addressed in a statistical manner, they can be integrated using meta-
analysis. Through this technique information can be compiled from many
studies that, when taken as a group in the narrative sense, appear inconclusive
and even incomprehensible.

The techniques of meta-analysis have been extensively described
elsewhere (Glass, 1978; Glass, McGaw, White, & Smith, 1980). Integrations
of research findings have been done in areas such as class size and achievement
(Glass & Smith, 1979), diagnostic remedial instruction in science
(Yeany & Miller, 1961), and attitude and achievement in science (Willson,
1981). In its most basic form the procedure of meta-analysis involves
determining the difference between experimental and control group mean
scores in standard deviation units (called an effect size). The impact of
a technique (such as a particular teaching strategy) in standard score
units can then be examined across a variety of studies.
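The basic effect size calculation described above can be sketched in a few lines. This is an illustration only: it assumes the control group's standard deviation is the standardizing unit (the text says only "standard deviation units"), and the function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def effect_size(experimental_scores, control_scores):
    """Difference between group means, expressed in control-group SD units.

    Using the control group's sample standard deviation as the divisor is an
    assumption of this sketch; the report's own formulas are in its Appendix A.
    """
    difference = mean(experimental_scores) - mean(control_scores)
    return difference / stdev(control_scores)


if __name__ == "__main__":
    # Hypothetical achievement scores, invented for illustration only.
    experimental = [78, 85, 90, 72, 88]
    control = [70, 75, 80, 65, 85]
    print(f"effect size = {effect_size(experimental, control):.2f}")
```

A positive value means the experimental group outscored the control group; a value of 1.0 would mean the experimental mean sits one control-group standard deviation above the control mean.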

In this meta-analysis study, the impact of twelve categories of
teaching techniques was examined. In addition to calculating the size of
the effect in each study, information was collected about student and
teacher characteristics, details of the treatment, and experimental conditions.
The purpose of collecting this contextual information was to
determine the circumstances in which the teaching strategies had their
influence. By crosstabulating various features of studies (e.g., size of
class or grade level of subjects) with the effect size, a picture of the
conditions under which a teaching strategy has maximum or minimum impact
begins to emerge.
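The crosstabulation idea can be illustrated with a small grouping routine that reports the number of effect sizes and their mean for each level of a study feature. The record layout, feature names, and values below are hypothetical and are not taken from the project's coding form.

```python
from collections import defaultdict
from statistics import mean


def mean_effect_size_by(records, feature):
    """Group coded effect sizes by one study feature.

    Returns {feature level: (number of effect sizes, mean effect size)}.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[feature]].append(record["effect_size"])
    return {level: (len(values), mean(values)) for level, values in groups.items()}


if __name__ == "__main__":
    # Hypothetical coded records, invented for illustration only.
    coded = [
        {"strategy": "questioning", "grade_level": "middle", "effect_size": 0.50},
        {"strategy": "questioning", "grade_level": "college", "effect_size": 0.25},
        {"strategy": "wait-time", "grade_level": "middle", "effect_size": 0.75},
        {"strategy": "testing", "grade_level": "college", "effect_size": 0.25},
    ]
    for level, (count, average) in mean_effect_size_by(coded, "grade_level").items():
        print(f"{level}: n = {count}, mean effect size = {average:.2f}")
```

Reporting the count alongside each mean matters, because a mean based on only a few effect sizes supports much weaker generalizations than one based on many.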

Definition of Teaching Techniques

Since the purpose of this study is to determine the impact of
various teaching techniques or methods on science achievement, it was
important to define what the techniques were. This was necessary in order
to select appropriate studies and clearly communicate results. It was
also necessary because another group in the Colorado project was analyzing
what was referred to as instructional systems. An initial definition
provided by the project staff was as follows:

Teaching methods are thought of as narrower, less encompassing
than instructional systems. Whereas the latter might plausibly
guide a great many decisions about the organization and conduct
of teaching a science course, teaching methods refer to more
limited aspects of a teaching plan (e.g., the method of testing,
type of questioning, wait-time and the like). Studies in which
teaching methods are evaluated are typically of short duration
and limited to one or two narrow topics.

This definition of teaching technique was used to define twelve
categories of teaching techniques. They are audio-visual, focusing, grading,
inquiry, manipulative, modified, presentation approach, questioning, teacher
direction, testing, wait-time and miscellaneous. Each of these categories
will be briefly discussed with representative examples included.

Audio-visual 

Although the bulk of the media-based instruction was considered
to be under the domain of instructional systems, some was limited enough 
in scope or duration to be appropriately considered as a teaching technique. 
Examples of experimental A-V teaching techniques that were compared with 
control methods are: 

1. Films on a specific topic 

2. Videotaped presentations 

3. Audio-taped directions 

4. Supplemental pictures, photos, or diagrams.

Focusing

Teaching techniques in this category are those where
something occurs to alert students to the objectives or intent of instruction.
Focusing techniques may be employed before, during or after instruction.
General examples of these include:

1. Students provided with objectives 

2. Objectives reinforced at different points during instruction 

3. Various organizers of instruction. 

Grading 

Experimental techniques included here involve changes in the grading
system that the researcher has reason to suspect may result in improved
student performance. Specific examples are:

1. Use of pass/fail grading

2. Students assigning their own grades.

Inquiry-discovery

In general, the teaching techniques that involved more student-centered,
less step-by-step teacher-directed learning experiences are included
in this category. Very often the techniques were identified by the authors
of a research report as being inquiry or discovery. In nearly all cases
these techniques are compared with a control method identified as being
"traditional," "expository" or "conventional." Examples of the techniques
used are:

1. Inquiry lessons 

2. Guided discoveries 

3. Inductive laboratories. 

Manipulative 

Students operate, handle or in some way work or practice with 
physical objects as part of the instructional process. Generally a single 
device or kind of manipulation is involved in this group of techniques. 
Examples are: 






1. Operation of a specific piece of apparatus 

2. Physical practice of some skill

3. Sketching or drawing 

4. Constructing something. 

Modified 

This category includes studies in which a researcher changes a single portion of instruction
to test for improved student achievement. In almost all cases the modification
or revision is of instructional materials. Examples include:

1. Materials rewritten or annotated

2. Directions presented other than by written word

3. Change in laboratory equipment. 

Presentation Mode 

This broad category covers means of instruction in which several changes
in material have taken place, the setting of instruction is different, or
the student approach, the introduction to a topic, or the teaching
arrangements differ from what is considered a more traditional method.
Representative examples are:

1. Field trips 

2. Group discussions 

3. Individual or self-paced lessons

4. Games and simulations

5. Team teaching.

Questioning

Teaching techniques that involve the use of varying levels or
position of questions in instruction belong to this group. Examples of
specific techniques found here are:

1. Questions inserted in a film 

2. Knowledge and comprehension level questions at the start of a 
unit 

3. Questions before, during, or after an assigned reading 

4. Use of high level questions. 

Teacher Direction 

Variations in the extent to which the learning task was spelled out
for the student typified teaching techniques classified under teacher
direction. Specific illustrations include:

1. Students conduct experiments or activities given only sketchy
direction

2. Students select objectives and assume responsibility for
learning

3. Indirect instruction.

Testing 

This category covers techniques in which tests were used in various ways with a view toward
improved student achievement. Usually this involved a change in the
frequency of testing, the purpose of testing, or the level of the test
items. The use of feedback is also included here. Examples are:

1. Formative testing

2. Immediate or explanatory feedback 

3. Diagnostic testing and remediation 

4. Optional retesting 

5. Testing to mastery. 

Wait-time

This category included studies that used increased duration of
wait-time. These can be identified as:

1. Long vs. short wait-time

2. Added pauses at key response points.

Miscellaneous 

These are all the other teaching techniques not classifiable into
any of the previous categories. Examples include:

1. Students performed extra experiments related to the topic of
instruction on their own time.

2. Students viewed a film more than one time.

Clearly these categories are by no means an absolute system for 
classifying teaching techniques. Further it is evident that while some 
of the categories established are clearly defined, others do not lend 
themselves to precise specification. These categories were formulated and 
studies were placed in them after all reports had been coded. This allowed 
for careful consideration of each category and study in terms of all the
documents reviewed. They represent the variety of means researchers have
used to bolster science achievement by altering some aspects of the
instructional situation. Altered teacher behaviors, student actions and
responsibilities, classroom materials and equipment, time, and testing are 
all included. 

Procedures 

Selection of Studies 

The literature base searched for studies relating teaching techniques
and science achievement included microfilmed dissertations, ERIC
documents and reports, and the periodical literature. Studies selected
for possible coding from these sources first had to have titles that
implied they dealt with what would be considered teaching techniques.
The basis for this judgment was the definition of teaching techniques or
methods provided by the meta-analysis project steering committee and further
clarified through discussions at a training session held for all persons
involved in the meta-analysis project. The contents of each study thus
selected were then examined to confirm that the study was relevant.

The sample of studies that were ultimately coded is further
described by the following items:

1. Age of Subjects: The studies used mainly included students in
grades 6 through college. It was originally intended that studies with
students in grades kindergarten through college would be involved. Part
way through the coding process the decision was made to limit the sample to
only those investigations using subjects in the 6-college range. This was
necessary to keep the total coding task within manageable proportions.

2. Geography: Studies were limited to those written in English
and reported in the United States.

3. Year of Publication: No study published earlier than 1949
was used.

4. Control Group Used: A control or contrast group identifiable
as being traditional or conventional was necessary.

5. Sufficient Data: Enough data were included in the report so
that an effect size could be calculated and identified as being positive
or negative.
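The five selection criteria above can be read as a screening rule applied to each candidate report. The sketch below is a hypothetical illustration: the field names, the numeric grade coding (13 standing for college), and the wording of the rejection reasons are assumptions made here, not part of the project's procedures.

```python
from dataclasses import dataclass

COLLEGE = 13  # hypothetical numeric code for college-level subjects


@dataclass
class CandidateStudy:
    lowest_grade: int            # lowest grade level of the subjects
    highest_grade: int           # highest grade level (COLLEGE for college)
    language: str
    country: str
    year: int
    has_conventional_control: bool
    effect_size_computable: bool


def rejection_reasons(study: CandidateStudy) -> list:
    """Return the reasons a study fails the criteria (empty list if it passes)."""
    reasons = []
    if study.lowest_grade < 6 or study.highest_grade > COLLEGE:
        reasons.append("subjects outside the grade 6 through college range")
    if study.language != "English" or study.country != "United States":
        reasons.append("not written in English and reported in the United States")
    if study.year < 1949:
        reasons.append("published earlier than 1949")
    if not study.has_conventional_control:
        reasons.append("no traditional or conventional control group")
    if not study.effect_size_computable:
        reasons.append("insufficient data to calculate an effect size")
    return reasons


if __name__ == "__main__":
    example = CandidateStudy(4, 6, "English", "United States", 1972, True, False)
    print(rejection_reasons(example))
```

Returning the list of reasons, rather than a bare yes or no, mirrors the practice of keeping notes on why individual studies were rejected.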

Notes were kept on the reasons individual studies were rejected.
The predominant reason for disqualification was that upon examination of
the contents of studies it became apparent that they did not deal with
experimental teaching techniques as earlier defined. The second most common
reason for rejection was that not enough information was provided to allow
for effect size calculation. Rather complete means have been developed
(see Glass, McGaw, White, & Smith, 1980) to determine effect sizes even
when treatment and control means and standard deviations are not available.
For example, given values of t and sample sizes it is possible to determine
an effect size (see Appendix A for effect size calculation formulas). But
even with these techniques insufficient information would still sometimes
halt the meta-analysis process. Studies were also rejected because subjects
were outside the 6-college age range or because there was no control or
contrast group.
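As an illustration of recovering an effect size when means and standard deviations are not reported, one widely used conversion for an independent-samples t statistic is d = t * sqrt(1/n_e + 1/n_c). Whether this matches the exact formula in the report's Appendix A is an assumption, and the result is standardized by the pooled rather than the control-group standard deviation. The function name and example values are hypothetical.

```python
import math


def effect_size_from_t(t_value: float, n_experimental: int, n_control: int) -> float:
    """Approximate a standardized mean difference from an independent-samples t.

    Uses d = t * sqrt(1/n_e + 1/n_c), which follows from the definition of the
    pooled-variance t statistic. Treating this as the report's own conversion
    is an assumption of this sketch.
    """
    if n_experimental <= 0 or n_control <= 0:
        raise ValueError("sample sizes must be positive")
    return t_value * math.sqrt(1.0 / n_experimental + 1.0 / n_control)


if __name__ == "__main__":
    # Hypothetical values: t = 2.0 with 25 students in each group.
    print(f"estimated effect size = {effect_size_from_t(2.0, 25, 25):.2f}")
```

The sign of t carries over to the effect size, so a report must also make clear which group was favored before the result can be coded as positive or negative.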

Data Sources 

Research studies were examined that came from the following 
documents: 

1. Science education doctoral dissertations — these included all 
dissertations available on microfilm from the Ohio State ERIC 
Center that related teaching techniques and science achievement. 

2. ERIC documents and reports— these were identified by a computer 
search of the ERIC data base to identify reports of teaching 
techniques in science. 

3. Journal articles — these included searching for relevant
studies in all issues of the Journal of Research in Science
Teaching (1963-1981), all issues of the Journal of College
Science Teaching (1970-1981), and all issues of Science Education
from 1970 to 1981.

There are certainly additional sources of research reports relevant 
to this study. But the sources examined should provide the bulk of the 
reports available and allow reasonable inferences to be made about the 
aggregate effect of various teaching strategies on science achievement. 



Coding the Studies 

In order to ensure that important variables from each study were
examined and recorded in a consistent way, a suitable coding form had to
be developed. This involved identifying potentially useful variables and
incorporating them into a concise and easy-to-use format.

Development of a coding form began at the training session in 
Colorado where discussion generated a number of the variables that were 
incorporated. These variables were then categorized and arrayed into a 
convenient format to produce the initial coding instrument. Revisions to
the form occurred as the actual coding process proceeded. These modifications
were made to best accommodate the studies being coded.

A total of 76 variables were included on the final coding form and 
classified into major categories. Each of these categories is identified 
and briefly discussed. A copy of the coding form and information about
each variable are included in this report (see Appendix B).

1. Report ID (3 variables). This group of variables provides a
document number and identifies the coder.

2. Study Data (4 variables). Variables here are used to account
for single or multiple treatments and measures. The year and form of the
study are also identified.

3. Student Data (17 variables). Characteristics of study subjects
such as grade level, SES, and number of students involved are documented
by this group of variables.



4. Teacher Data (13 variables). Teacher characteristics such as
gender, age, educational background and experience teaching are part of 
this category. 

5. Context Characteristics (3 variables). The size of the classes
and the schools involved as well as the types of communities are accounted 
for here. 

6. Design Characteristics (6 variables). Experimental design
considerations such as means of assignment of subjects and teachers are
observed.

7. Treatment (16 variables). The specific types of teaching 
techniques and the roles of the teachers and students as well as the dura- 
tion of the treatment are among the variables that are parts of this 
category. 

8. Outcome Characteristics (4 variables). Variables considered as
outcome characteristics involve the kinds of measures used and the relia- 
bility and reactivity of each. 

9. Effect Size Calculation (10 variables). The kind of data used 
to calculate effect size and the study effect sizes are included in this 
group of variables. 
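
The category counts listed above can be checked against the stated total of 76 variables. The short sketch below does that arithmetic; it assumes the Study Data and Outcome Characteristics categories contain 4 variables each, and the dictionary name is illustrative only.

```python
# Number of coded variables in each category of the coding form.
# Assumption: Study Data and Outcome Characteristics hold 4 variables each.
variables_per_category = {
    "Report ID": 3,
    "Study Data": 4,
    "Student Data": 17,
    "Teacher Data": 13,
    "Context Characteristics": 3,
    "Design Characteristics": 6,
    "Treatment": 16,
    "Outcome Characteristics": 4,
    "Effect Size Calculation": 10,
}

total_variables = sum(variables_per_category.values())
print(total_variables)
```

Under that assumption the nine categories account for all 76 variables on the final coding form.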

The Coding Process 

The first group of studies coded were the microfilmed dissertations.
These were selected by Colorado on the basis of title and provided by the
ERIC Center at Ohio State. The contents of each of the more than 300
dissertations were examined locally to determine if they actually dealt
with teaching techniques as earlier defined. The studies were further
screened to include only those using appropriately aged subjects, a
traditional or conventional control group, and sufficient data to calculate
effect size.

The second group of studies coded were ERIC documents. Abstracts 
of some 2,000 ERIC-available science studies were provided by Colorado.
These were reviewed to identify those that appeared relevant to the teaching
techniques question. Copies of the studies thus selected were obtained
locally on microfiche and screened like the dissertations to determine which 
ones would actually be coded. 

Finally, the journals were scanned on an issue-by-issue basis.
These were all issues of the Journal of Research in Science Teaching, the
Journal of College Science Teaching, and Science Education from 1970 to the
present. Studies were selected for coding in the same manner used with
dissertations and ERIC documents.

Once it had been determined that a particular study was usable, it
next had to be coded. This involved reading the study and locating or
determining values for as many of each of the 76 variables specified on 
the coding form as possible. In cases such as student grade level, or 
measure reliability, variable values were stated in the study. Values for 
other variables, like SES, were inferred when sufficient evidence was 
available. In the case of study effect size, the value had to be calcu- 
lated in every instance. 

In this meta-analysis, study effect size is a standard measure of
the difference between an experimental teaching technique and a traditional
method. The vast majority of effect sizes were computed using means and
standard deviations. To compute effect size when comparing an experimental
teaching technique to a control method, the mean of the control group is
subtracted from the mean of the experimental group and the difference is
then divided by the standard deviation of the control group. An effect size
is designated as negative when the mean score of the experimental group is
lower than that of the control group. A single study results in multiple
effect sizes when there is more than one experimental treatment or when
there are two or more post measures.
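
The calculation just described can be sketched in a few lines. The function name and the sample scores below are hypothetical and serve only to illustrate the arithmetic: the difference between the group means divided by the control-group standard deviation.

```python
def effect_size(mean_experimental, mean_control, sd_control):
    """Difference between group means in control-group standard deviation units."""
    if sd_control <= 0:
        raise ValueError("control-group standard deviation must be positive")
    return (mean_experimental - mean_control) / sd_control


# Hypothetical post-test results for two studies.
positive_case = effect_size(78.0, 72.0, 12.0)   # experimental group scored higher
negative_case = effect_size(70.0, 72.0, 8.0)    # experimental group scored lower
print(positive_case, negative_case)
```

A study with two experimental treatments, or with two post measures, would simply call the function once per comparison and so contribute more than one effect size.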

Data from the coding forms were key punched (see Appendix C for the
layout of the 76 study variables on the computer cards) and analyzed using
the Statistical Package for the Social Sciences (Nie, et al., 1975) on an
IBM 360 computer. The analysis included descriptive statistics for all
continuous and categorical variables, categorizing some variables originally
coded as continuous, and cross-tabulating a number of study characteristics
(e.g., grade level of subjects) with effect size.

Results 

The computer analysis of the 76 variables considered in the study
produced the results that follow. Note throughout this portion of the
report that many of the variables are not mentioned primarily because there 
was limited information available in the research documents. For example, 
the educational background of teachers was reported in fewer than 30% of the 
studies and is not discussed here. 

Descriptive Data 

Research studies of the effects of various teaching strategies were
examined that covered approximately the last 30 years. Figure 1 shows the
distribution of effect sizes associated with different spans of years.
The early 1970's produced the largest group of effect sizes (40% of the
total).



[Figure 1: bar chart. Vertical axis: percentage of effect sizes (scale 10 to 50). Horizontal axis: YEAR OF STUDY (Before 1960; 1960-1964; 1965-1969; 1970-1974; 1975 to Present).]

Figure 1. Percentage of effect sizes represented by different time periods.

The studies selected for the meta-analysis were conducted with 
students primarily from grade 5 through the early college years. Figure 2 
shows the percentage of effect sizes associated with each span of grade 
levels. 



[Figure 2: bar chart. Vertical axis: percentage of effect sizes (scale 10 to 50). Horizontal axis: GRADE IN SCHOOL (Elementary, through Grade 5; Middle School, Grades 6-8; High School, Grades 9-12; Post High School).]

Figure 2. Percentage of effect sizes represented by students at different
grade levels.



The traditional areas of study in science are represented among the 
research reports included in the teaching strategies meta-analysis. Figure 
3 shows the percentage of effect sizes for each science subject area. 



[Figure 3: bar chart. Vertical axis: percentage of effect sizes (scale 5 to 30). Horizontal axis: SCIENCE SUBJECT AREA (Earth Science; Chemistry; Physical Science; General Science; Biology; Other).]

Figure 3. Percentage of effect sizes represented by different subject
areas.

Outcome measures used in the studies of teaching strategies ranged 
from traditional cognitive tests to interview techniques. Figure 4 shows 
the distribution of outcome measures.



[Figure 4: bar chart. Vertical axis scale 10 to 60. Horizontal axis: TYPE OF OUTCOME MEASURE (Processes/Methods of Science; Affective Measures; Problem Solving/Creativity; Cognitive Achievement; Other).]

Figure 4. Number of effect sizes associated with different types of
outcome measures.



Additional data that provide information about the contexts in 
which the selected studies were conducted are given below. 

1. The effect sizes came from three types of reports:

     Dissertations                      56%
     Journal Articles                   26%
     Unpublished Papers                 18%

2. The research studies were conducted in different types of
community settings.

     Rural-town                         17%
     Suburban                           35%
     Urban                              20%
     Not Classified                     29%

3. The instruments and methods used to measure outcomes were
distributed as follows:

     Instruments developed
       especially for the study         38%
     Published instruments              31%
     Regular classroom tests            20%
     Observations and interviews         8%
     Other                               3%

4. The number of subjects included in the various studies were
distributed as follows:

     50 or fewer subjects               12%
     51 - 99                            28%
     100 - 199                          30%
     200 or more subjects               27%
     Unknown                             3%

5. The number of teachers involved in the studies are as follows:

     1 or 2 teachers                    28%
     3 - 8                              21%
     9 or more                           9%
     Unknown                            42%

6. The teaching strategies were used in classes of various size
distributed as follows:

     Fewer than 15 students              8%
     15 - 24                            29%
     25 - 34                            28%
     35 or more students                 9%
     Unknown                            26%

7. The studies in which the teaching strategies were used were
conducted over varying lengths of time.

     2 hours or less                    15%
     3 - 10                             19%
     11 - 20                             4%
     More than 20 hours                 32%
     Unknown                            30%

8. The reliability of the criterion measures used were distributed
as follows:

     .69 or less                         6%
     .70 - .89                          36%
     .90 - 1.00                         10%
     No information given               48%




Effect Size Data 

A total of 160 studies were coded, resulting in 411 effect sizes.
The overall mean effect size in this analysis is .34 (across all
teaching strategies). The average impact of using one of the teaching
strategies analyzed in this report, therefore, was to increase achievement
by about one-third of a standard deviation. In terms of percentiles, the
mean effect of using the teaching strategies was to increase scores by
about 13 percentile points.
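
The step from an effect size to a percentile gain can be sketched as follows. The sketch assumes that control-group scores are normally distributed, so that a student at the control mean (the 50th percentile) moves to the percentile given by the standard normal cumulative distribution evaluated at the effect size. The function names are illustrative and do not come from the report.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_gain(effect_size):
    """Percentile points gained by a student who starts at the control mean.

    Assumption: scores in the control group follow a normal distribution.
    """
    return 100.0 * (normal_cdf(effect_size) - 0.5)


# Overall mean effect size reported in this analysis.
overall_gain = percentile_gain(0.34)
print(round(overall_gain))
```

An effect size of zero gives no gain, and a negative effect size gives a loss of the same magnitude as the corresponding positive value.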

Not all teaching strategies had the same impact on achievement.
Table 1 gives the mean effect size for the 12 categories of teaching
strategies used in this analysis. More confidence can be placed in some
of these compared to others because of the number of effect sizes
represented by each mean score.

Mean effect sizes were calculated for studies conducted with students
of different grade levels. Table 2 provides mean effect size information
for the four categories of grade levels used.

This meta-analysis of the effects of teaching strategies provided
some additional data concerning learning in classes of different size.
Table 3 presents findings similar to those of Glass and Smith (1979) in
their study of class size and achievement. The largest effects of the
different teaching strategies are associated with the smallest class
size.

Different mean effect sizes were found in the different academic 
areas represented by the studies in the analysis (see Table 4) . 




Table 1

Mean Effect Sizes Obtained for Cognitive and Other Outcomes
Obtained Using Different Teaching Strategies

Type of                  Cognitive*             Other**               Total          % of
Strategy              Mean    SD     n      Mean    SD     n      Mean    SD     n  All Cases

Wait-Time              .53   .02     2      1.27   .00     2       .90   .43     4      1
Focusing               .48   .90    25      1.37   .63     3       .57   .91    28      7
Manipulative           .56   .64    24        --    --    --       .56   .64    24      6
Modified               .55   .45    20       .27   .34     2       .52   .45    22      5
Questioning            .56   .37    11       .07   .06     2       .48   .39    13      3
Inquiry-Discovery      .41   .87    38       .15   .29    20       .32   .73    58     15
Testing                .37   .49    33       .14   .34    11       .32   .46    44     11
Presentation Mode      .24   .54    77       .29   .62    26       .26   .56   103     26
Teacher Direction        ?     ?     ?         ?     ?     ?         ?     ?    45     11
Audio-Visual Methods   .16   .49    30       .33   .35     3       .18   .48    33      8
Grading               -.13   .39    13      -.40   .00     1      -.15   .38    14      4
Miscellaneous          .53   .24     8       .23   .19     4       .43   .26    12      3
Total                  .35   .64   309       .30   .61    91       .34   .63   400    100

*The Cognitive category includes low and high level outcomes, general achievement, and problem solving.
**The "Other" category includes critical thinking, creativity, logical thinking and affective measures.
? = entry illegible in the source; -- = no entry reported.



Table 2

Mean Effect Sizes Obtained with Students
at Different Grade Levels

Grade Level                     Mean Effect          Number of   % of
of Student                         Size        SD      Cases     Cases

Elementary (through Grade 5)        .36       .71        50        13
Middle School (Grades 6-8)          .30       .74        93        24
High School (Grades 9-12)           .25       .53       164        43
Post High School                    .42       .70        77        20



Table 3

Mean Effect Sizes Obtained in Classes
of Different Size

                            Mean Effect          Number of
Size of Class                  Size        SD      Cases

Fewer than 15 Students          .74       .86        32
15 - 24                         .37       .60       119
25 - 34                         .23       .46       114
35 or more Students             .23       .57        38



Table 4

Mean Effect Sizes Obtained in Studies Focusing
on Different Academic Areas in Science

                                          Number of
Area of Study                      SD       Cases

Physical Science
General Science
Biology
Chemistry
Earth Science

[The numerical entries of this table are not legible in the source.]



Another means of examining mean effect sizes is by the source of 
the literature report. Table 5 gives mean effect sizes for the three types 
of reports examined in this analysis. 



Table 5

Mean Effect Sizes for Various Types of
Literature Reports

                                Mean Effect          Number of   % of
Type of Report                     Size        SD      Cases     Cases

Journal Articles                    .41       .67       105        26
Dissertations                       .32       .66       230        56
Unpublished (ERIC Documents)        .30       .51        74        18
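
As a consistency check, the three report-type means in Table 5 can be pooled into a single mean by weighting each by its number of cases. The sketch below is only an illustration of that arithmetic; the variable names are not from the report.

```python
# (type of report, mean effect size, number of cases) as listed in Table 5.
table_5_rows = [
    ("Journal Articles", 0.41, 105),
    ("Dissertations", 0.32, 230),
    ("Unpublished (ERIC Documents)", 0.30, 74),
]

total_cases = sum(cases for _, _, cases in table_5_rows)
weighted_sum = sum(mean * cases for _, mean, cases in table_5_rows)
pooled_mean = weighted_sum / total_cases

print(total_cases, round(pooled_mean, 2))
```

The pooled value rounds to the overall mean effect size of about one-third of a standard deviation reported for the analysis as a whole.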






The studies reviewed for this analysis included widely differing 
numbers of students. Table 6 provides mean effect sizes associated with 
the studies of various size. 



Table 6

Mean Effect Sizes Obtained in Studies with
Different Number of Students Involved

Number of Subjects          Mean Effect          Number of   % of
in Study                       Size        SD      Cases     Cases

0 - 50                          .66       .90        49        12
51 - 99                         .41       .71       115        29
100 - 199                       .35       .53       125        31
200 or more                     .09       .38       110        28


The average effect sizes associated with studies using different
numbers of teachers are shown in Table 7.



Table 7

Mean Effect Sizes for Studies Involving
Different Numbers of Teachers

                            Mean Effect          Number of   % of
Number of Teachers             Size        SD      Cases     Cases

1 - 2                           .41       .70       116        28
3 - 8                           .35       .56        86        21
9 or more                       .20       .30        36         9
Unknown                          --        --       173        42



The studies analyzed were conducted over widely differing amounts 
of time. Some were done in only a class period or two while others lasted 
for several months. The information in Table 8 shows the mean effect sizes 
associated with four categories of study time. 



Table 8

Mean Effect Sizes for Studies Conducted for
Different Amounts of Time

Duration of Study           Mean Effect          Number of   % of
(Hours)                        Size        SD      Cases     Cases

0 - 2                           .44       .84        63        15
3 - 10                          .43       .71        77        19
11 - 20                         .20       .36        16         4
More than 20                    .33       .57       132        32
Unknown                          --        --       123        30



Interpretation and Implications 

What conclusions should be drawn from an integration of the research 
on teaching strategies in science that produces an overall effect size of 
about one-third of a standard deviation (.336)? Are alterations in such 
things as teacher questions or directions, student activities, classroom 
materials, tests or grading practices worth the effort when they result in 
student scores that are on the average 13 percentile points higher than in 
the unaltered classes? These questions are probably unanswerable and perhaps
even unimportant unless one is concerned about the overall impact of
innovations. Most often teachers, teacher educators, researchers, or
instructional developers have interest in a certain instructional strategy.
Thus their concern is with the impact of a specific teaching strategy and
not the effect of all teaching strategies. Even the clumping of strategies
into 12 categories as has been done in this report makes it difficult to
determine the impact of a teaching strategy such as "providing students
with instructional objectives" because it is lumped with all other
Focusing strategies.

The picture provided by this meta-analysis of teaching strategies 
is a macroscopic view. It provides evidence on the general, overall 
impact of a category of strategies but does not give fine, detailed micro- 
scopic information on a particular strategy. 

The information in Table 1 shows a range of effect sizes from .90 
(Wait-Time) to -.15 (Grading). Recent reviews of instructional research 
(e.g., Rosenshine, 1979) have concluded that direct teaching strategies 
have greater impact than indirect ones. Is that conclusion supported by 
this analysis? There appears to be some support: among the strategies with
relatively large effect sizes (higher than the mean) are Focusing and
Questioning, and among those with relatively small effect sizes are
Inquiry-Discovery and Teacher Direction. Wait-Time strategies have the
largest impact but they also account for the fewest studies reviewed in
any category.

The effect sizes associated with classes of different size (see 
Table 3) should provide strong evidence for policy makers who advocate 
smaller classes. In this case the accumulated research results confirm 
teachers' contentions.

Research methodologists can glean items of some interest from this 
report. Studies involving the fewest number of subjects produced the largest 
effect sizes (Table 6). Essentially the same information is obtained by 
examining the number of teachers in a study and the related effect sizes 
(Table 7). Again, the smaller the number of teachers involved, the larger the
effect size. Both the number of subjects and number of teachers may have
much to do with faithful implementation of a strategy. Treatment fidelity
may suffer when large numbers of students and correspondingly large numbers 
of teachers are involved. 

Another point of methodological interest is seen in the data on 
duration of a study (Table 8). Short studies produced larger impacts
(larger mean effect sizes) than long studies. This may also relate to 
treatment fidelity wherein the control over a strategy may lessen as a 
study stretches on. 

There seems to be little to say about the effect sizes associated
with different academic areas (Table 4) or grade level of subjects (Table 2).
A pattern in the effect sizes that begins in the grade level data
(elementary → middle school → high school) is broken with the college
students. For the academic areas the more general topics seem to yield
higher effect sizes than specialized subject areas such as biology,
chemistry, or earth science.

Critics or even advocates of instructional strategy research may
feel that the overall impact of the various strategies is somewhat small.
Several points should be made, however, to show that the aggregate score
may obscure much information about impact. Among the studies examined, the
range of the effect sizes was nearly 6 standard deviation units. The
largest effect size was 3.58 and the smallest was -2.10. A useful analysis
of a particular teaching strategy might be to identify features of studies
(either design, treatment, or context) associated with large and small
impact. It may be possible to refine a teaching strategy by emphasizing
features associated with large impact and minimizing or dropping features
associated with little effect. Subsequent research could then determine if
the adjustments were advantageous. It is interesting to imagine how several
strategies, none of which has an overwhelming impact, might influence
achievement if used in concert. Consider classes in which Focusing strategies
(effect size = .57), Questioning (effect size = .48), and Testing (effect
size = .32) were combined by teachers. The overall influence might not be
the simple sum of the individual contributions but a combined influence
would be expected. This engineering of teaching strategies to optimize
achievement would seem to have much promise.






REFERENCES 



Glass, G. Primary, secondary and meta-analysis of research. Educational
Researcher, 1976, 5, 3-8.

Glass, G. Integrating findings: The meta-analysis of research. Review
of Research in Education, 1978, 5, 351-379.

Glass, G., McGaw, B., & Smith, M. Meta-analysis in social
research. Beverly Hills, California: Sage
Publications, 1981.

Glass, G., & Smith, M. Meta-analysis of research on the relationship of
class-size and achievement. Educational Evaluation and Policy Analysis, 1979,
1, 2-16.

Nie, N., Hull, C., Jenkins, J., Steinbrenner, K., & Bent, D. Statistical
package for the social sciences. New York: McGraw-Hill, 1975.

Rosenshine, B. Content, time, and direct instruction. In P. Peterson &
H. Walberg (Eds.), Research on teaching. Berkeley, CA: McCutchan,
1979.

Willson, V. A meta-analysis of the relationship between science achievement
and science attitude: Kindergarten through college. A paper presented
at the National Association for Research in Science Teaching
annual meeting, Grossinger, New York, April, 1981.

Yeany, R., & Miller, P. The effects of diagnostic/remedial instruction on
science learning: A meta-analysis. Journal of Research in Science
Teaching, 1984, 1% (in press).




The effect of inquiry teaching and advance organizers upon 
student outcomes in science education: 
A meta-analysis of selected research studies 



Gerald W. Lott
Institute for Research on Teaching 
Michigan State University 






TABLE OF CONTENTS 



INTRODUCTION 

METHODOLOGY 

Selection of Studies 
Identifying and Coding Variables
Calculating Effect Sizes 
Analysis of Effect Size Data 
Analysis and Interpretation of Data 

RESULTS OF DATA ANALYSIS 

Inductive vs. deductive 
Advance Organizers 

CONCLUSIONS AND RECOMMENDATIONS 











LIST OF TABLES

TABLE 1 ........ 336
TABLE 2 ........ 345
TABLE 3 ........ 351
TABLE 4 ........ 353





LIST OF FIGURES

FIGURE 1 ....... 340
FIGURE 2 ....... 356
FIGURE 3 ....... 362
FIGURE 4 ....... 36[illegible]



INTRODUCTION 

The purpose of the research was to determine the relationship 
between variations in the nature and structure of instructional content 
and outcome variables across the relevant experimental studies. Included 
here are the comparison of the inductive vs. deductive approach (Shulman &
Keisler, 1966) and the use of advance organizers (Ausubel, 1960, 1963).
Of several areas dealing with the nature and structure of instructional
content, only these areas of research provided a data base of sufficient
depth to justify a meta-analysis. Among the topics considered but not
included here are behavioral objectives, kinetic structure, mathemagenic
behavior, curriculum scope and curriculum organization. Thus the data
base for this study accounts for the coding of 128 characteristics for 39
studies selected from the 72 coded as part of the larger study. The larger
study in turn is but one part of a broader project to integrate science
education research.



The author would like to acknowledge the valuable suggestions 
provided by Edward L. Smith and Lee S. Shulman, Institute for Research
on Teaching, Michigan State University, and Carol Blumberg, Department of
Educational Studies, University of Delaware during the final revision of
the coding sheet. 



Various normative arguments have led to the development and 
execution of numerous empirical studies. The guiding assumption for 
the current analysis has been that these studies could provide information 
for further research through quantitative analysis of their characteristics.
Relative patterns which exist among appropriate variables of each study
could be revealed through the utilization of meta-analysis.

This process involves viewing the studies as the units on which 
measurements are taken with the variables being the coded study character- 
istics and effect sizes. The coding variables included 57 which were 
concerned with features of the treatment while 12 were concerned with 
outcome attributes. Aspects, such as methodology, sample characteristics, 
and instructional experiences were examined quantitatively in terms of 
their relationships to the treatment effects through the use of a common 
metric for all studies as defined by Glass (1978). 




METHODOLOGY 

Selection of Studies 

Studies included in this analysis were selected from the one- 
hundred and fifty-one studies provided for possible inclusion by the 
Science Meta-Analysis Project. These studies were then examined to (1) 
determine their relevance to the broader research question and (2) 
ascertain the availability of the necessary data for effect size calcula- 
tion. Those studies having means and standard deviations were coded 
first, while other studies requiring more extensive calculations or 
which had minimal data reported were set aside for coding if time permitted. 
Several journal-reported studies were coded based upon a limited search of
articles in the Journal of Research in Science Teaching for 1977-1980.
However, major emphasis was given to the coding of dissertation studies. 
This examination resulted in a collection of 105 studies found to be 
relevant to the topic and having sufficient data for the calculation of 
effect sizes. 

Studies analyzed and reported in this report spanned the period 
from 1957 through 1980. Most of the studies were conducted during the 
1969 through 1973 period. The majority of the studies used in the meta- 
analysis reported here were dissertations with 33 studies (85%) being 
nonpublished doctoral dissertations and 6 studies (15%) being articles 
published in professional journals. All dissertations received were 
completed prior to 1977. It is estimated, based upon a review of the 
references given in those studies coded as well as a survey of reviews 
of science education research, that this analysis represents approximately
35% of the advance organizer research and 25% of the inductive-deductive
research. 

The systematic analysis of these studies resulted in the coding
and calculation of 424 effect sizes. A separate effect size was calculated
and study characteristics coded for each distinct outcome variable within 
each study. An average effect size was calculated for each instance where 
a particular outcome characteristic value would be used several times 
within a comparison for a study. 
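
The averaging rule described above can be sketched as follows. The record layout (study code, comparison code, outcome characteristic, effect size) and all values are hypothetical; the sketch only shows how repeated uses of one outcome characteristic within a comparison collapse to a single average effect size.

```python
from collections import defaultdict
from statistics import mean


def average_repeated_outcomes(records):
    """Average effect sizes that share a study, comparison, and outcome value.

    Each record is a tuple: (study, comparison, outcome, effect_size).
    """
    grouped = defaultdict(list)
    for study, comparison, outcome, effect_size in records:
        grouped[(study, comparison, outcome)].append(effect_size)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical coded records: study S01 measured cognitive outcomes twice
# within the same comparison, so those two effect sizes are averaged.
coded_records = [
    ("S01", 1, "cognitive", 0.30),
    ("S01", 1, "cognitive", 0.50),
    ("S01", 1, "affective", 0.10),
    ("S02", 1, "cognitive", -0.20),
]
averaged = average_repeated_outcomes(coded_records)
print(len(averaged))
```

Averaging in this way keeps a study that reports many similar measures from contributing disproportionately to a category mean.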

Identifying and Coding Variables 

To make full use of statistical methods in describing and communicat- 
ing study findings and accounting for their variance, the characteristics of 
the subjects, teachers, context, design, treatment, and assessment for each 
study were expressed in quantitative terms. Some of the features and the 
nature of the assessment procedures used an ordinal scale while others 
involved a nominal scale based upon indicator variables. 

The development of the coding form involved the preliminary coding 
of several articles. In addition, a survey of several reviews of science
education research as well as learning and cognition reviews was made to
ascertain the important characteristics to be coded. This resulted in the
adoption of several classifications based upon categories proposed in the
literature. The following provides a brief overview of the characteristics
which were coded. A description of the conventions used for many of the
coding sheet items as well as the coding sheet is provided in [illegible].

The study identification variables were used to distinguish studies
as well as multiple effect size codings within single studies. The study
code identified each individual study. Each comparison within a study was
given a code, while within each comparison a separate code was given to each
outcome measure. Thus, those studies which compared more than one treatment,
varied the sample characteristics, or used more than one outcome measure for
any comparison were given distinct codes.

The student characteristics variables were intended to delineate 
various important features of the samples used. Teacher characteristics 
variables provided background information concerning the individuals 
presenting the instructional treatment. Characteristics of the context 
were intended to provide necessary information concerning the environment 
in which the study was conducted. 

Coding for design characteristics included the methods for assign- 
ing students and teachers to the treatments as well as the unit of analysis 
and experimental design used. The coding for the internal validity of 
each study followed the convention that intact and highly dissimilar samples 
based upon ability or socio-economic level, were classified as having low 
internal validity. Those found to have intact or randomly selected 
classrooms with similar characteristics were coded as medium, while those 
studies which involved complete random selection of subjects and had low 
mortality were coded as high. The coding for the type of study followed
the system proposed by Campbell and Stanley (1963).

Treatment characteristics included such aspects as preinstructional
strategies, the inquiry orientation of instructional tasks, the
characteristics of the learning tasks as well as the content, and the type
of instructional techniques. The preinstructional strategies included the
coding for the type of advance organizer used. The distinctions used for the
coding of the level of inquiry were based upon those suggested by Shulman
and Tamir (1973). The coding for the characteristics of the learning tasks
involved items concerned with the kinds of activities (Johnson, Rhodes, and 
Rumery, 1974). Categories for the structure of content were based upon 
those suggested by Haggis and Adey (1979) while those used for the 
characteristics of the questions asked as part of instructional tasks were 
based upon those proposed by Bloom (1956) for the level of cognitive reasoning
and by Johnson, Rhodes, and Rumery (1974) for the level of generality. 

The coding for the outcome characteristics included categories such 
as the intent of the assessment, the domain orientation, the type and method 
of measurement, the reactivity, and the reliability. The convention for 
coding the intent of the assessment was based upon the novelty of the context 
(Johnson, Rhodes, and Rumery, 1974); i.e. whether it was to assess the 
acquisition of knowledge involving identical or similar aspects, or whether 
it was to assess transfer to related or new situations. The domain of 
orientation distinguished between cognitive, affective, and behavioral 
forms of assessment. The convention for reactivity involved the 
specification of whether judgments were objective or subjective. 

In general, most studies were very limited in the description of 
study characteristics and thus several of the variables were never or 
seldom used. As a result nearly 25% of the variables were eliminated early 
in the data analysis process including almost half of the student character- 
istic variables and nearly all of the teacher characteristic variables. In 
addition, thirty percent of the treatment characteristic variables were 
eliminated due to limited usage. 



Calculating Effect Sizes 

The calculation of effect sizes for this study utilized, where 
possible, the definition proposed by Glass (1978). Generally, if post-test 
means and standard deviations were provided, this procedure was used. 
In other cases the appropriate approaches as presented by Glass, McGaw and 
Smith (1981) were utilized. 
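
As a minimal sketch of the calculation described above, the effect size 
attributed to Glass is commonly stated as the difference between the 
experimental and control post-test means divided by the control group 
standard deviation. The function name and the sample statistics below are 
hypothetical and are not taken from any coded study. 

```python
def glass_effect_size(mean_experimental, mean_control, sd_control):
    """Difference in post-test means expressed in control-group SD units."""
    if sd_control <= 0:
        raise ValueError("control-group standard deviation must be positive")
    return (mean_experimental - mean_control) / sd_control


# Hypothetical post-test statistics, for illustration only.
es = glass_effect_size(mean_experimental=52.0, mean_control=50.0, sd_control=8.0)
print(round(es, 2))  # 0.25
```

A positive value favors the experimental group under this sign convention. 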



Analysis of Effect Size Data 

The approach utilized in this study for effect size data analysis 
was within the exploratory data analysis paradigm. The delineation of 
the approach to the analysis of effect size data will consist of an 
elucidation of the exploratory nature of the analysis, and of the 
statistical approaches and sequence used. 

It is argued that the data analysis procedures following data 
acquisition should be exploratory. The use of these procedures prior to 
the further application of inferential statistical methods can provide the 
data base for the formulation of "conjectures" and the resultant design of 
"experimental arrangements." The concept of conjecture is used in the 
sense proposed by Popper (1962) while experimental arrangement is a concept 
elaborated by Hanna (1966) which refers to the nature of the treatment 
variables. I am, however, using it in a broader sense to include the 
presage and process variables. 

The intent is to formulate questions for further research and 
provide direction for research programs. Tukey (1980) has indicated that 
research questions should be formulated only after extensive exploration 
of the data base. Exploratory data analysis potentially can provide 
insights regarding features of the experimental arrangements. These 
discernments can clarify interrelationships and give direction to the 
science education research effort. 

Leinhardt and Leinhardt (1980) suggest that exploratory data 
analysis is "an approach that illuminates rather than obscures the 
analysis of data and makes apparent rather than disguises analytic results" 
(p. 85). They later point out that: 

The philosophy is one in which the analyst's first task is 
viewed as discovery of evidence, not evaluation, and 
consequently the tools are designed to reveal unforeseen 
features rather than create a decision-analytic framework 
for judging the importance of expected features (p. 149). 

These quantitative techniques give direction to future research through 
descriptive analysis rather than providing a basis for confirmatory 
inferences. Thus, exploratory data analysis can provide "descriptive power" 
(Hanna, 1969), the essential characteristic of which is that models 
reflect additional information transmitted by the data. It is within this 
framework that exploratory data analysis can assist in developing further 
experimental arrangements with explanatory power. 

The intent of the procedures described below is to expose the 
relationships between study characteristics and effect sizes using 
descriptive data. The statistical analysis began with the use of SPSS 
FREQUENCIES. The next step involved the use of SPSS CROSSTABS to organize 
the study characteristics data systematically in contingency tables for 
further analysis. 
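
The two tabulation steps named above can be illustrated with a rough 
Python analogue. This is only a sketch of what a frequency count and a 
two-way contingency count do; it is not the SPSS runs used in the study, 
and the record keys and coded levels are hypothetical. 

```python
from collections import Counter


def frequencies(records, variable):
    """Count how often each coded level of one variable occurs."""
    return Counter(record[variable] for record in records)


def crosstab(records, row_variable, column_variable):
    """Count effect sizes in each (row level, column level) cell."""
    return Counter(
        (record[row_variable], record[column_variable]) for record in records
    )


# Hypothetical coded effect sizes; keys and levels are illustrative only.
records = [
    {"approach": "inductive", "assignment": "random"},
    {"approach": "inductive", "assignment": "intact"},
    {"approach": "deductive", "assignment": "random"},
    {"approach": "inductive", "assignment": "random"},
]

print(frequencies(records, "approach"))
print(crosstab(records, "approach", "assignment"))
```

Cells with few or no entries show up directly in the cross-tabulated 
counts, which is what makes the table useful for deciding which 
comparisons have enough data. 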

Since the principal task of this study is exploratory rather than 
confirmatory, the findings will be summarized using exploratory data 
analysis methodology (Tukey, 1977). The data displays were box-and-whisker 
plots, a technique used to display batches of data. The median as well as 
the upper and lower quartiles are calculated and the quartiles used to 
define, on a vertical axis in this study, the boundaries of a narrow 
rectangle. The length of the box is used to define "inner fences," while 1.5 
times this distance defines "outer fences." These conventions are a 
modification of those proposed by Tukey (1977) and are based upon those 
used by McNeil (1977). 
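
The display conventions just described can be sketched in Python. The 
sketch rests on two assumptions that the text does not spell out: hinges 
are computed as medians of the lower and upper halves of the sorted batch, 
and the fences are measured outward from the hinges (one box length for 
the inner fences, 1.5 box lengths for the outer fences). The function 
names and the sample batch are hypothetical. 

```python
from statistics import median


def hinges(values):
    """Return (lower hinge, median, upper hinge) for a batch of numbers."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2  # each half includes the median when n is odd
    return median(data[:half]), median(data), median(data[n - half:])


def box_summary(values):
    """Summarize a batch: hinges, fences, outside and far out values."""
    lower, mid, upper = hinges(values)
    box = upper - lower  # length of the box
    inner = (lower - box, upper + box)
    outer = (lower - 1.5 * box, upper + 1.5 * box)
    ordered = sorted(values)
    far_out = [v for v in ordered if v < outer[0] or v > outer[1]]
    outside = [
        v for v in ordered
        if (outer[0] <= v < inner[0]) or (inner[1] < v <= outer[1])
    ]
    return {
        "lower_hinge": lower,
        "median": mid,
        "upper_hinge": upper,
        "inner_fences": inner,
        "outer_fences": outer,
        "outside": outside,
        "far_out": far_out,
    }


# Hypothetical batch of effect sizes, for illustration only.
summary = box_summary([-1.0, -0.2, 0.0, 0.1, 0.3, 0.5, 3.0])
print(summary["far_out"])
```

Values between the inner and outer fences are reported as outside values 
and those beyond the outer fences as far out values. 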

Analysis and Interpretation of Data 

The guiding interest was determining the relationship between 
effect sizes and study characteristics. The intent was to analyze the 
population of effect sizes across characteristics within each of two 
research areas: Inductive vs Deductive, and Advance Organizers. The 
principal goal of this analysis was to determine the relationship between 
effect sizes and study characteristics within these selected research 
areas through a comparison between effect sizes across the levels of each 
descriptive variable for each of the two defined research variables. This 
approach includes the review of correlation coefficients, the examination 
of study design characteristics in relation to the effect size, and treat- 
ment characteristics in relation to effect size. All of these calculations 
were done using microcomputer programs based on those written by McNeil 
(1977). 

Due to the limited number of studies for each of the two defined 
research variables a "dependent measure" approach was used in the analysis 
where each of the 424 effect sizes was treated statistically as an 
independent data point. Thus, any dependence of the data due to those studies 
yielding more than one effect size due to multiple but distinct outcome 
instruments or factorial designs providing multiple comparisons is not 
accounted for. However, a sample of study characteristics was selected 
and the median and hinges were calculated and found to differ by only .07 
effect size for the "dependence approach." 

The following sections explore the relationships between each study 
characteristic variable and the effect sizes for each research variable. 
This analysis will be limited to those cases where the crosstabulated 
cells for the research variable by study characteristic in question have a 
sufficient number of cases. To justify any discussion intended to lead to 
useful recommendations and conclusions, it was deemed necessary to have 10 
or more effect sizes and 4 or more studies. 
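
The inclusion rule stated above (10 or more effect sizes and 4 or more 
studies per cell) can be expressed as a simple filter. This is a sketch 
only; the record layout, level names and study identifiers are 
hypothetical. 

```python
from collections import defaultdict

MIN_EFFECT_SIZES = 10
MIN_STUDIES = 4


def eligible_levels(records):
    """Return levels with enough effect sizes and enough distinct studies.

    records: iterable of (level, study_id, effect_size) tuples.
    """
    effect_counts = defaultdict(int)
    studies = defaultdict(set)
    for level, study_id, _effect_size in records:
        effect_counts[level] += 1
        studies[level].add(study_id)
    return sorted(
        level
        for level in effect_counts
        if effect_counts[level] >= MIN_EFFECT_SIZES
        and len(studies[level]) >= MIN_STUDIES
    )


# Hypothetical data: only "random" meets both thresholds.
records = (
    [("random", f"study{i % 4}", 0.1) for i in range(12)]
    + [("intact", f"study{i % 3}", 0.0) for i in range(12)]
    + [("matched", f"study{i}", 0.2) for i in range(8)]
)
print(eligible_levels(records))  # ['random']
```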



RESULTS OF DATA ANALYSIS 

Inductive vs Deductive 

This research variable, defined by the crosstabulation of those 
items specifying an inductive or deductive approach for the experimental 
and control group, can be characterized by the learning activities sequence. 
Educational experiences in which examples or observations were provided to 
students prior to formalizing generalizations were classified as inductive. 
Those studies where generalizations were formulated prior to any illustrative 
examples were characterized as deductive. The analysis of the data base 
found 212 effect sizes from 24 studies where the treatment or control group 
was coded as inductive or deductive. 

Table 1 shows the mean effect size for inductive vs deductive 
teaching on several outcome measures. Effect sizes in favor of the 
inductive approach are labelled positive and those in favor of the deductive 
approach are designated negative. The composite of these several outcome 
measures has an effect size of .06. In the aggregate, there is essentially 
no difference between the two approaches. 

A summary analysis of this data base across all study characteris- 
tics resulted in finding a median effect size of .02 and hinges of .33 and 
-.22. Effect sizes in favor of the inductive approach are designated 
positive and the deductive negative. Thus, the average student experiencing 
an inductive instructional approach did better than only 51% of the control 
group. It must be remembered, however, that approximately 60% of the 
studies used a level of inquiry only slightly different from the deductive 
approach. In addition, the spread between hinges indicates that 75% of 
those having an inductive approach did better than 42% of the control group. 
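
Statements of the form "did better than 51% of the control group" are 
usually obtained by reading an effect size as a z-score against a normally 
distributed control group. The sketch below makes that normality 
assumption explicit; it is an illustration of the conversion, not a 
statement of the procedure actually used, and the function name is 
hypothetical. 

```python
from statistics import NormalDist


def percent_of_control_exceeded(effect_size):
    """Percent of a standard-normal control group scoring below effect_size."""
    return 100 * NormalDist().cdf(effect_size)


# A median effect size of .02 corresponds to roughly the 51st percentile.
print(round(percent_of_control_exceeded(0.02)))  # 51
```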



Table 1 

Mean Effect Sizes for Inductive vs. Deductive Teaching 
on Several Outcome Measures 

                    Mean ES      s       n 

Knowledge             .02       .67     18 
Application          -.10       .36      4 
Process               .29      1.57      8 
Problem Solving      -.01       .57      8 
Composite             .06       .87     38 

The following narrative will describe any differences in effect size 
based on particular study characteristics. This description will provide 
insights as to circumstances where an inductive approach could be expected 
to provide a more effective educational approach. 

Several variables were not analyzed due to the lack of variation 
in study characteristics across studies within this research topic. These 
were features with insufficient data for comparative analysis but which 
provided added detail concerning the nature of treatments within this 
research category. None of the studies which described the grouping patterns 
had grouped subjects. The scope of content for the majority of effect 
sizes coded was disciplinary. Within this framework the organization of 
content was generally concept-oriented with some treatment comparisons 
using an organizational scheme involving topics. While there was somewhat 
more variation in the features concerning manipulative level, most of the 
treatments were characterized as having individual manipulation with 
objects. However, nearly half of the treatment comparisons utilized picture 
study. The majority of studies did not fully describe the mode of 
communicating knowledge. Those which did generally used the laboratory, 
although nearly one-third used demonstration. Fifty-five percent of the 
outcome comparisons were concerned with the acquisition of knowledge by 
subjects. Of these studies 66% were characterized as assessing student 
performance on knowledge similar to that used in the instruction. 

The correlation of data within this research topic resulted in some 
statistically significant correlations (where p < .01). Those with r ≥ .49 
and having important methodological and educational implications are shown 
in Table 2. 




The following findings are evident in Table 2. 

1. Those studies with samples having higher IQ ability had a more 
heterogeneous make-up. 

2. Those studies conducted in a suburban environment had subjects 
from higher socio-economic status than those studies conducted 
in an urban area. 

3. Those studies whose assignment of subjects was less experimental 
generally involved a longer duration. Experimental studies were 
usually of shorter duration. On the other hand, those studies 
using random selection of subjects generally involved more 
sessions. 

4. Those studies classified as inductive were generally conducted in 
an urban environment and involved quasi-experimental designs with 
low internal validity. 

5. Studies which utilized designs with higher internal validity in an 
experimental framework generally involved the use of a more 
structured learning environment. 

6. The studies with guided exploration utilized fewer sessions than 
those which were structured. 

7. As might be expected, the deductive approaches were more 
structured than the inductive-oriented learning experiences. 

8. The inductive approaches utilized a higher level of inquiry as 
opposed to those which were deductive. 

9. Studies with higher-level inquiry involved less restrictive 
environments and greater access to manipulative activities than 
those studies having lower levels of inquiry. 

10. Approaches in which subjects made judgments or organized 
elements into new patterns were inductive-oriented with a 
higher level of inquiry than those which required subjects 
to simply retrieve information. 

11. Those studies aimed at developing or assessing concepts 
generally utilized the biological sciences for their content 
orientation. 

The data analysis resulted in the observation of several differ- 
ences which are important for future methodological decisions. These 
results are shown as graphical box-and-whisker plots in Figure 1, where 
fences are used to identify the outside and far out values of the effect 
sizes for each comparison. These comparisons were made to discern if the 
data required any separate analysis for those studies with a stronger 
design as a result of being noticeably different from the complete data 
base. 



FIGURE 1 

BOX-AND-WHISKER PLOTS FOR DESIGN CHARACTERISTICS, OUTCOME 
CHARACTERISTICS, AND SOURCE OF EFFECT SIZE 

[Plots not reproduced. Panels: Assignment of Subjects; Type of Study; 
Experimental Design; Type of Measurement; Reactivity; Reliability; 
Effect Size Source. Vertical axis: effect size. Rows beneath each plot: 
upper quartile, median, lower quartile, mean effect size, standard 
deviation, number of effect sizes, number of studies.] 

There was little difference in resultant effect size between those 
studies with random assignment of subjects and those which utilized intact 
groups. Those with intact groups did, however, have less variation in 
values. In addition, little difference was found when comparing the quasi- 
experimental and experimental studies. Differences were observed between 
those studies using a simple blocking design and those using a factorial 
or covariant design which may suggest that different conclusions may be 
drawn depending upon the experimental design. In this case, however, further 
analysis indicated that there was insufficient data for a separate analysis. 

A comparison between the effect sizes for those studies which 
utilized nationally published as opposed to ad hoc published outcome 
measures showed little difference. Differences were detected, however, 
between those studies using highly reactive measures and those having 
moderate to low reactivity. A separate analysis was conducted for those 
studies having a low outcome reactivity even though there were notably 
fewer studies in the medium and high categories leading to the possibility 
that they may not have of themselves influenced the results. In addition, 
the effect sizes for those studies using an outcome measurement with a 
reliability ≤ .79 were dissimilar from those of studies having 
instrumentation with a reliability ≥ .80. Insufficient data, however, were 
available for an analysis with studies having outcome reliabilities ≥ .80. 
Finally, little variation was found in the effect sizes across studies 
depending upon how the effect size was calculated. 

The frequencies analysis and crosstabulation of study characteristics 
resulted in the selection of 30 variables which provided adequate data for 
comparative analysis. The results of this analysis, with 21 variables 
meeting the criteria for inclusion, are shown in Figure 2. If different 
results were obtained by the selection of comparisons having low reactivity, 
only the results from the low reactive comparison are provided. 

Outcome measures were categorized along the following three 
dimensions: (1) the intent of the assessment acquisition, i.e., the use of 
identical information as compared to dissimilar information, (2) the intent 
of assessment transfer, and (3) the assessment domain of orientation, i.e., 
knowledge, application, process, or problem-solving. 



TABLE 2 

CORRELATION COEFFICIENTS FOR STUDY CHARACTERISTICS OF 
INDUCTIVE VS. DEDUCTIVE RESEARCH 

      SC03 SC16 CC02 TD01 TD02 EX06 EX07 EX11 EX30 EX32 EX33 
SC02  .86 
SC10  .56 .65 .63 
SC15  .97 
CC02  .57 .59 .51 
DC01  .59 -.53 -.54 
DC03  .61 -.55 
DC05  -.64 .53 .53 -.53 
TD02  .65 -.68 
EX06  -.87 -.68 -.53 
EX07  .72 
EX08  .65 
EX11  .83 
EX13  -.58 
EX31  .80 



META-ANALYSIS DATA FILE VARIABLES AND DEFINITIONS 

SC02 - Ability level (IQ) 
SC03 - Homogeneity of IQ: (1) Homogeneous (2) Heterogeneous 
SC10 - SES: (1) Low (2) Low Medium (3) Medium (4) Medium and High (5) High 
SC15 - Class size (no. of students): Experimental 
SC16 - Class size (no. of students): Control 
CC02 - Community type: (1) Urban (2) Rural (3) Suburban (4) Mixed 
DC01 - Assignment of Subjects to Treatments: (1) Random (2) Matched 
       (3) Intact Groups (4) Self Select 
DC03 - Rated Internal Validity (see conventions): (1) Low (2) Medium (3) High 
DC05 - Type of Study: (1) Correlational (2) Quasi-Experimental (Descriptive) 
       (3) Experimental (4) Pre-Experimental (One Group Pre-Post) 
TD01 - Number of weeks 
TD02 - Number of sessions 
EX06 - Inductive vs Deductive: (1) Inductive (Discovery) (2) Deductive 
       (Expository) 
EX07 - Guidance: (1) Structured (2) Free exploration (3) Guided exploration 
EX08 - Level of Access: (1) Remote demonstration (2) Individual manipulation 
EX11 - Levels of Inquiry (see Shulman & Tamir, 1973): (1) None (2) Low 
       (3) Medium (4) High 
EX14 - Mathemagenic Behaviors (see Rothkopf, 1970): (1) Used (2) Translation 
       (3) Segmentation (4) Process 
EX30 - Degree of Generality: (1) Items (2) Categories (3) Systematic Patterns 
EX31 - Type: (1) Progressive Differentiation (2) Developmental Level of 
       Cognitive Functioning (3) Hierarchical (4) Random (5) Learning cycle 
       (i.e., SCIS) 
EX32 - Sequencing Unit: (1) Single lesson (2) Instructional unit 
       (3) Instructional Term (4) Instructional Program 
EX33 - Content-orientation (see Klopfer, 1971): (1) General Science 
       (2) Biological Sciences - 10-24 (3) Chemistry - 26-35 (4) Physics - 
       41-48 (5) Earth Sciences - 56-60 (6) Biochemistry 



There is an indication that the inductive approach does not help 
students when the evaluation criterion is the acquisition of identical 
information. For this criterion the students taught with an inductive 
approach were -.22 of a standard deviation from the average member of the 
control group. When the evaluation instrumentation called for the 
demonstration of a capability with similar concepts the inductive group was 
only .05 of a standard deviation from the control group. An examination 
of the remaining outcome characteristics shows that there was little 
difference between the inductive and deductive groups for transfer, 
comprehension, or application of concepts, as well as process skills and 
problem solving. 

Intermediate students seemed to perform better within an 
inductive-oriented setting with a .18 standard deviation difference. The 
average intermediate student would do better than 57% of those within a 
deductive approach. Moreover, 75% of those taught with the inductive approach 
would perform better than 52% of those in the deductive group. There was 
little difference at the junior high level and the average subject in high 
school who was exposed to this approach actually performed no better than 
44% of those in the deductive-oriented approach. 

There seems to be little difference in approach depending upon 
ability level, homogeneity of ability or gender. Those having an IQ of 
93-107 performed as well as 47% of those in the deductive group. Subjects in 
heterogeneous groups accomplished nearly as much as those in the control 
group while those in homogeneous groups performed as well as 42% of those 
in the deductive group. Moreover, distinctions based upon seriation ability 
indicate little difference in performance. There are differences, however, 
with respect to class size. Those in classes of 17-26 performed better 
when experiencing an inductive approach, with the average subject 
performing better than 56% of those in the deductive group. As class size 
increased, performance in comparison to the deductive group decreased. 

The community context in which studies were conducted had little 
relationship to student performance. Variation in context never resulted 
in more than a difference of .08 standard deviations. The duration of 
the study also seemed to have little effect upon the accomplishments of 
the inductive group. In each time span the average subject in the treat- 
ment group was within .1 standard deviation of those in the deductive group. 

Variations in a number of the treatment characteristics seemed to 
make little difference in the performance of the inductive group compared 
to those having a deductive approach. These characteristics included the 
content-orientation, whether the materials included text and manipulative 
or manipulative only, or whether teacher interaction was direct or indirect. 
In addition, it did not seem to make a difference whether the learning 
experiences were intended to develop an understanding of categories or 
systematic patterns. 

The level of inquiry (Shulman and Tamir, 1973) also did not seem 
to affect the performance. It should be noted, however, that more than 
60% of the studies involved students in a medium level of inquiry. The 
potential for a multivariate analysis was explored using the level of 
inquiry of the learning experiences treatment characteristic as an independent 
variable. It was not pursued, however, due to the limited number of studies 
in which the level of inquiry was codable. 

Several other characteristics seem to affect the results including 
the level of guidance, and the kind of activities. Subjects experiencing 
inductive learning through guided exploration performed .2 of a standard 
deviation better than those with an inductive approach having a more 
structured environment. While the average student in the treatment group 
performed better than 54% of those in the deductive-oriented group, those 
in a more structured atmosphere could only perform on the average better 
than 46% of those in the control group. Where the intent of the learning 
experiences was to have students work with categories and organize knowledge 
into new patterns they performed on the average better than 52% of the 
control group, whereas those who were required to make distinctions 
performed only better than 49% of those having a deductive approach. In 
fact, the results indicate that 75% of those in the inductive approach did 
better than 48% of those in the deductive-oriented group. 

Variations in several other characteristics also seem to affect the 
results. These features include the instructional sequencing used and the 
mode of communicating knowledge. Subjects experiencing an inductive 
curriculum organized with the progressive differentiation of concepts 
performed .19 standard deviation better than those having a hierarchical 
inductive curriculum; this was actually .24 standard deviation when including 
outcome measurements having medium and high reactivity. Moreover, where the 
framework for the sequencing of concepts was the instructional program, 
students performed better than when the sequencing unit was the instructional 
unit. Seventy-five percent of those using a program sequence performed better 
than 50% of the deductive group and better than nearly 75% of those having the 
unit as the sequencing feature. Where the emphasized mode of communicating 
knowledge was through discussion the average subject performed better than 
56% of the deductive group, while there was no difference between inductive 
and deductive when both discussion and lecture were utilized. 

Implications. The results of this analysis comparing the 
inductive and deductive approaches provide a framework for conjectures 
as well as directions for future research. Conjectures which seem 
justified include the apparent positive effect the inductive approach 
has at the intermediate level. Moreover, this approach seems to be more 
useful in those situations where high levels of thought, learning 
experiences, and outcome demands are placed upon the subjects. In addition, 
the inductive approach appears to function better when the curricular 
organization is formulated across units to involve the complete program. 

It was realized early in the analysis that more research concern- 
ing the level of inquiry needs to be conducted. Many studies indicate 
their concern with inquiry but few address the level of inquiry involved. 
The difference found between the effect of inquiry experiences upon 
comprehension and process skills outcome needs to be further explored. 
This might include the collection of qualitative data concerning treat- 
ments in an effort to explore their nature and characteristics. This 
should provide an insight into the difference in effect. 

Several suggestions for future research can be ascertained from 
the results of this meta-analysis. It would be useful to conduct studies 
having a range of inquiry levels and utilizing variations in characteristics 
of manipulative involvement, curricular organization, and approach to the 
communication of knowledge. 

Those conducting research in this area should consider the reactivity 
and reliability of outcome instruments. Where more reactive instruments 
must be used, researchers should increase the collection and description of 
qualitative data. The increased use of qualitative data may be helpful 
in further exploring the difference in effect sizes between instructional 
sequences based upon progressive differentiation and those with hierarchical 
sequences. Data collection expanded to include qualitative data is more 
practical through the use of probit transformations in the calculation of 
effect sizes. In addition, any increase in qualitative description 
would assist in the coding of study characteristics. 

[Figure, continued. Box-and-whisker plots not reproduced. Panels: 
Assessment Domain of Orientation; Intent of Assessment Acquisition; 
Intent of Assessment Transfer; Characteristics of Materials; Mode of 
Communicating Knowledge; Teacher Interaction.] 

Experimental research in this area should include less structured 
learning environments for the inductive approach as well as more rural and 
suburban contexts for the studies. Future quasi-experimental studies need 
better documentation of research features as well as more deductive treat- 
ments. 

Advance Organizers 

Advance organizers (Ausubel, 1960, 1963) were proposed to improve 
"meaningful verbal learning" through the association of what is to be 
learned with the learner's current conceptual framework. The data base 
on this topic included 147 effect sizes from 16 studies where the treatment 
or control group was coded as using an advance organizer. The data base, 
most of which was not included in the meta-analysis reported by Kozlow and 
White (1980), is limited mainly to dissertations and science education. 
Due to the limited data base, a multivariate analysis was not possible. 

Table 3 shows the mean effect size for advance organizers vs a 
control group on several outcome measures. Effect sizes in favor of the 
advance organizer group are labelled positive. The composite of the two 
outcome categories has an average effect size of .24. In the aggregate, 
advance organizers have an advantage of about one quarter of a standard 
deviation over a control. 
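
The report does not state how the composite was formed. As a hedged check, 
an n-weighted mean of the two category means in Table 3 happens to 
reproduce the reported composite of .24; the sketch below shows that 
arithmetic under that assumption only, and the function name is 
hypothetical. 

```python
def weighted_mean(means, counts):
    """Mean of category means, weighted by the number of cases in each."""
    if len(means) != len(counts) or not counts:
        raise ValueError("means and counts must be non-empty and equal length")
    return sum(m * n for m, n in zip(means, counts)) / sum(counts)


# Table 3 values: Knowledge (.09, n = 17) and Application (.77, n = 5).
composite = weighted_mean([0.09, 0.77], [17, 5])
print(round(composite, 2))  # 0.24
```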



Numerous variables were not analyzed due to the lack of variation 
in study characteristics across studies within this research topic. These 
features, however, did provide added detail concerning the nature of 
treatments. There was very little variation in the features of advance 
organizer research across studies. Most experimental arrangements did not 
use any grouping, were disciplinary in scope, and were interested in 
student comprehension as an outcome. Only 11.6% of the treatment 
comparisons used application as an outcome variable. Little variation 
was found in the type of measurement with a sizable 93.2% of the treatment 
comparisons using an ad hoc published instrument for outcome measurement. 
The method of measurement was also seldom different with 88.4% being 
multiple choice. While there was more variation found in the 
content-orientation of advance organizer research it seemed apparent that the 
physical sciences were the most popular at [..]5.3%. 

TABLE 3 

Mean Effect Sizes for Advance Organizers vs. Control on Two 
Outcome Measures and a Composite 

                Mean ES      s       n 

Knowledge         .09       .59     17 
Application       .77*      .47      5 
Composite         .24       .63     22 



A correlation analysis of the variables within this topic 
resulted in some statistically significant correlations (where p < .01). 
Those with r ≥ .49 in absolute value and having important methodological 
and educational implications are reported in Table 4. 
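
A screening of this kind can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the coded values are hypothetical, only the .49 magnitude cutoff is applied (the p < .01 test is omitted), and the function names are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def screen(data, threshold=0.49):
    """Return variable pairs whose |r| meets the threshold."""
    names = sorted(data)
    kept = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(data[a], data[b])
            if abs(r) >= threshold:
                kept[(a, b)] = round(r, 2)
    return kept


# Hypothetical coded values for six studies (not taken from the data file).
coded = {
    "ID05": [1968, 1970, 1972, 1975, 1977, 1979],  # year of study
    "TD02": [12, 10, 9, 6, 5, 3],                  # number of sessions
    "SC01": [7, 9, 5, 10, 8, 6],                   # modal grade
}

kept = screen(coded)
for pair, r in kept.items():
    print(pair, r)
```

With these made-up values only the year/sessions pair survives the cutoff, and its sign is negative, which is the form of relationship reported for ID05 and TD02.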










TABLE 4 

CORRELATION COEFFICIENTS FOR STUDY CHARACTERISTICS OF 
ADVANCE ORGANIZER RESEARCH 

Column variables: ID05  SC01  SC05  SC14  SC15  CC02  DC01  TD01  TD02  EX25 

SC16    .98 
CC02    .52 
DC02   -.58   -.71 
DC03   -.53 
DC05   -.65   -.88 
DC06   -.64 
TD02   -.66   -.56 
EX25   -.64    .58    .68   -.68 
EX54    .69    .64   -.66    .64 

[Coefficients are listed in source order for each row variable; the 
column alignment of the original matrix is not recoverable.] 






META-ANALYSIS DATA FILE VARIABLES AND DEFINITIONS 

ID05 - Year of study 

SC01 - Modal grade 

SC05 - Gender (% female) 

SC14 - Special grouping: (1) Not grouped (2) Low track (3) Medium track 
       (4) High track (5) Voluntary 

SC15 - Class size (no. of students): Experimental 

SC16 - Class size (no. of students): Control 

CC02 - Community type: (1) Urban (2) Rural (3) Suburban (4) Mixed 

DC01 - Assignment of Ss to treatments: (1) Random (2) Matched (3) Intact 
       groups (4) Self-select 

DC02 - Assignment of teachers to treatments: (1) Random (2) Non-random 
       (3) Self-select (4) Crossed (5) Matched (6) Investigator 

DC03 - Rated internal validity (see conventions): (1) Low (2) Medium 
       (3) High 

DC05 - Type of study: (1) Correlational (2) Quasi-Experimental (Descriptive) 
       (3) Experimental (4) Pre-Experimental (One Group Pre/Post) 

DC06 - Experimental design: (1) Blocking (10) Factorial (30) Covariance 
       (31) Covariance Blocking (32) Covariance Factorial (33) Covariance 
       Blocking & Factorial 

TD01 - Number of weeks 

TD02 - Number of sessions 

EX25 - Scope of content: (1) Disciplinary (2) Integrated (3) Multi- 
       Disciplinary (4) Interdisciplinary 

EX54 - Text: (1) Text only (2) Text and manipulatives (3) Manipulatives only 





The following findings are evident in Table 4. 

1. The grouping of students was generally practiced more in the 
suburban environment than the urban or rural. 

2. The more recent studies used fewer sessions than earlier studies. 

3. In relation to design characteristics those studies having random 
assignment of subjects did not also randomly assign teachers. If 
teachers were randomly assigned the subjects were generally from 
intact classes. 

4. Those studies with female participants generally had low internal 
validity while those with male participants were generally higher. 

5. Experimental studies were usually conducted in the urban and rural 
environment. 

6. Studies which utilized designs intended to control for confounding 
variables were usually conducted at lower grade levels than those 
which used simpler designs. 

7. Recent studies were more inclined to utilize a more multi- or 
inter-disciplinary approach than earlier studies. 

8. It was not surprising that those studies conducted more recently 
involved the use of manipulatives more than earlier studies. 

9. Sample sizes for the treatment groups and the control groups were 
nearly equal. 

10. Manipulatives were generally included as an aspect of studies in 
suburban environments whereas treatments were more textually oriented 
in urban settings. 

11. In relation to the duration of the study, those using a multi- or 
inter-disciplinary organization tended to be longer in duration; 
however, there were fewer sessions. A similar inverse relationship 
existed in relation to the characteristic of instructional 
materials. While those involving the use of manipulatives 
tended to be longer duration studies, they used fewer sessions 
per study. 

An examination of design characteristics relative to study 
validity indicated several differences which are important for future 
methodological decisions. These results are provided in Figure 3. As with 
the previous research topic, these comparisons were made to discern if a 
separate analysis was required for those studies with a stronger design. 



FIGURE 2 

BOX-AND-WHISKER PLOTS FOR STUDENT, CONTEXT, TREATMENT, AND 
OUTCOME CHARACTERISTICS 

[Figure not reproduced. Panels: grade level; ability level: IQ; 
homogeneity of IQ; gender; study duration (no. of weeks); guidance; 
level of inquiry; kinds of activities; class size; seriation ability; 
community type; degree of generality; instructional sequencing; 
instructional sequencing unit; content orientation; teacher interaction; 
mode of communicating; characteristics of materials; intent of assessment: 
knowledge acquisition; intent of assessment: transfer; assessment domain 
of orientation. Several panels are marked as selected for low reactivity. 
Vertical axis: effect size, scaled from -4.0 to 10.0. Tabulated beneath 
each plot: upper quartile, median, lower quartile, ES, S(ES), N(E.S.), and 
N studies.] 



FIGURE 3 

BOX-AND-WHISKER PLOTS FOR DESIGN CHARACTERISTICS, OUTCOME 
CHARACTERISTICS, AND SOURCE OF EFFECT SIZE 

[Figure not reproduced. Panels: assignment of subjects; assignment of 
teachers; internal validity; type of study; experimental design; source 
of effect size. Vertical axis: effect size, scaled from -4.0 to 10.0. 
Tabulated beneath each plot: upper quartile, median, lower quartile, ES, 
S(ES), N(E.S.), and N studies.] 



FIGURE 4 

BOX-AND-WHISKER PLOTS FOR STUDENT CHARACTERISTICS, CONTEXT 
CHARACTERISTICS, TREATMENT CHARACTERISTICS, AND OUTCOME 
CHARACTERISTICS 

[Figure not reproduced. Panels: grade level; seriation ability; community 
type; style of advance organizer; characteristics of materials; intent of 
assessment: acquisition; assessment domain of orientation. Vertical axis: 
effect size, scaled from -4.0 to 10.0. Tabulated beneath each plot: upper 
quartile, median, lower quartile, ES, S(ES), N(E.S.), and N studies.] 

A comparison between the effect sizes for those studies which had 
high internal validity as opposed to medium internal validity showed little 
difference. In addition, an examination of effect sizes selected for 
source of effect size data showed little difference. However, there were 
differences detected between those studies using matched samples and those 
having random selection or intact groups. The effect sizes for those 
studies using matching techniques were in general .41 standard deviations 
higher. 

While effect sizes vary somewhat with respect to the selection of 
teachers, the type of study, and the experimental design, these differences 
are not very substantial. It was observed that 75% of the effect sizes for 
studies using non-random assignment of teachers to treatment were greater 
than 50% of those for studies using random assignment. In the case of type 
of study, 75% of the effect sizes for experimental studies were greater than 
50% of those for the quasi-experimental studies. Most of the effect sizes 
for those studies using a factorial design were greater than those from 
studies using a block design. These differences suggest that different 
conclusions may be drawn from the study characteristic analysis depending 
upon the design characteristics. However, a further analysis of study 
characteristics indicated that there was insufficient data for a separate 
analysis based upon the selection of specified design characteristics. 
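
Statements of the form "75% of the effect sizes for one group were greater than 50% of those for another" amount to comparing the lower hinge of one box plot with the median of the other. The sketch below illustrates that comparison with hypothetical effect sizes; the hinge rule used (median of each half, with the middle value shared when n is odd) is one common convention and is assumed here, since the report does not say which rule was applied.

```python
from statistics import median


def hinges(values):
    """Return (lower hinge, median, upper hinge) using Tukey-style halves."""
    xs = sorted(values)
    half = (len(xs) + 1) // 2  # the middle value is shared when n is odd
    return median(xs[:half]), median(xs), median(xs[-half:])


def lower_hinge_exceeds_median(group_a, group_b):
    """True when about 75% of group A lies above about 50% of group B."""
    lower_a, _, _ = hinges(group_a)
    _, median_b, _ = hinges(group_b)
    return lower_a > median_b


# Hypothetical effect sizes, for illustration only.
nonrandom = [0.05, 0.20, 0.30, 0.35, 0.45, 0.60, 0.80, 1.10]
random_assigned = [-0.30, -0.10, 0.00, 0.10, 0.15, 0.25, 0.40]

print(hinges(nonrandom))
print(hinges(random_assigned))
print(lower_hinge_exceeds_median(nonrandom, random_assigned))
```

The comparison is descriptive only; it says nothing about whether the difference between groups would hold up with more studies per cell.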

A summary analysis of the advance organizer data base across all 
study characteristics resulted in finding a median effect size of .09 and 
hinges of .43 and -.07. Thus, the average student experiencing an advance 
organizer preinstructional strategy did better than only 54% of the control 
group. However, the spread between hinges indicates that 75% of those 
having advance organizers did better than 47% of the control group. The 
following narrative will describe any differences in effect size based on 
particular study characteristics. This description will show the 
circumstances under which advance organizers could be expected to provide a 
more effective educational approach. 
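
The percentages quoted with the median and hinges are consistent with reading an effect size as a position in a normally distributed control group. The sketch below assumes that normal model (an assumption; the report does not spell out the conversion) and uses only the standard library. The function name is illustrative.

```python
from math import erf, sqrt


def percent_of_control_exceeded(effect_size):
    """Percent of a normal control distribution lying below the given
    effect size, expressed in control-group standard deviation units."""
    return 100 * 0.5 * (1 + erf(effect_size / sqrt(2)))


# Median and hinges reported in the summary analysis.
for label, es in [("median", 0.09), ("lower hinge", -0.07), ("upper hinge", 0.43)]:
    print(f"{label:12s} ES = {es:+.2f} -> {percent_of_control_exceeded(es):.0f}%")
```

Under this model an effect size of .09 corresponds to roughly the 54th percentile of the control group and -.07 to roughly the 47th, matching the figures given in the text.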

The frequencies analysis of study characteristics resulted in the 
selection of 72 variables which had adequate data for further study. A 
crosstabulation analysis found 24 variables which had sufficient data for 
possible comparative analysis when 10 effect sizes in a cross-tab cell was 
set as the minimum required. The results of this analysis, in which 7 
variables were found to meet the criterion of 4 or more studies, are shown 
in Figure 4. 
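
The two screening thresholds described above can be sketched as a simple filter. This is a hypothetical illustration: the record layout, field names, and data are invented, and applying both thresholds to each level of a variable is an assumption about how the criteria were combined.

```python
from collections import defaultdict

MIN_EFFECT_SIZES = 10  # minimum effect sizes per cross-tab cell
MIN_STUDIES = 4        # minimum number of distinct studies


def adequate_levels(records, variable):
    """Return the levels of `variable` that meet both minimums."""
    n_effect_sizes = defaultdict(int)
    studies = defaultdict(set)
    for rec in records:
        level = rec[variable]
        n_effect_sizes[level] += 1
        studies[level].add(rec["study"])
    return sorted(
        level
        for level in n_effect_sizes
        if n_effect_sizes[level] >= MIN_EFFECT_SIZES
        and len(studies[level]) >= MIN_STUDIES
    )


# Hypothetical effect-size records: one dict per coded effect size.
records = (
    [{"study": f"U{i % 4}", "community": "urban"} for i in range(12)]
    + [{"study": f"S{i % 2}", "community": "suburban"} for i in range(11)]
    + [{"study": f"R{i}", "community": "rural"} for i in range(3)]
)

print(adequate_levels(records, "community"))
```

In this made-up example the suburban level has enough effect sizes but too few studies, and the rural level has too few effect sizes, so only the urban level is retained.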



There is an indication that variation in grade level or student 
seriation ability makes little difference. However, effect sizes did 
differ depending upon community context. Those studies conducted in a 
suburban environment had effect sizes which were in general lower than 
those from studies in urban contexts. The average experimental subject 
from a suburban context scored above only 52% of the control group, 
while the average subject in an urban environment scored above 63% of the 
control group. Moreover, 75% of the effect sizes from the suburban studies 
were lower than 50% of those from rural studies. 

Enough data for analysis was found for only two treatment 
characteristics: the style of advance organizer and the characteristic of 
materials. In each case there was little difference between the advance 
organizer groups and the control groups. The effect sizes for studies 
using written or verbal advance organizers were similar, as were the effect 
sizes for the studies which used only textbooks or those having textual as 
well as manipulative materials. 

Only two outcome characteristics had adequate data for analysis. 
Little difference in effect sizes was found between those studies 
evaluating student performance on identical information as opposed to 
similar information. There was, however, a difference between performance on 
knowledge-oriented instruments and application instruments. The performance 
of the average subject on application was better than 68% of the control 
group, while the performance of the treatment group for comprehension items 
was better than only 46% of the control group. It should be pointed out, 
however, that the application analysis was based upon 17 effect sizes from 
only five studies while the analysis of the comprehension data was based 
upon 118 effect sizes from seventeen studies. 




Implications. The results of this analysis pertaining to the 
effect of advance organizers upon student outcomes have provided needed 
information for establishing directions for future research. In addition, 
they have influenced the formulation of conjectures concerning the 
effectiveness of advance organizers. 

The data analysis seems to indicate that advance organizers have 
been more advantageous in the urban setting than in rural or suburban 
contexts. There seems to be little effect depending upon grade level, style 
of organizer, or characteristics of materials. However, as noted above, 
there has been little variation in treatment or outcome characteristics 
across studies. It would be useful for future studies to break out of the 
past advance organizer research pattern and use as yet infrequently applied 
characteristics. 

A further exploration of outcome distinctions is necessary where 
such features as transfer and application are used. Very little advance 
organizer research has used application items for the assessment of 
performance or understanding. Future research should address this question 
of subject ability to apply what has been taught. Another aspect which 
should be considered is the extension of study duration and, in particular, 
the number of sessions. Moreover, because the results indicate differences 
in effect related to design characteristics, it may be appropriate to 
utilize more experimental and factorial designs, using a matching technique 
where possible. 

It would also be worthwhile to compare variations in type of 
advance organizer across other characteristics to determine any distinct 
effects based upon type of advance organizer. The data in this study was 
too limited to pursue that analysis. 



CONCLUSIONS AND RECOMMENDATIONS 

This meta-analysis may provide a foundation for the continued 
exploration of learning and teaching in science education. Some research 
areas should receive more attention, especially the level of inquiry under 
different curricular treatments. This aspect of curricular variation has 
not, as evidenced by this meta-analysis, been subjected to any extensive 
analysis. The duration of experimental studies should be extended and 
the collection of qualitative data should be increased for both 
quasi-experimental and experimental studies. The pursuit of these 
suggestions should assist the research community in better articulating any 
distinctions which exist between treatments. 

The most limiting aspect of this study has been the lack of 
descriptive information in studies coded. In addition, many studies were 
not coded due to insufficient reporting of descriptive or analysis statistics. 
It is hoped that the coding variables formulated for this study can provide 
a beginning framework for the design and communication of research 
characteristics in future studies. The lack of descriptive information, in 
addition to the limited number of studies coded, resulted in the inability 
to explore complex interactions and the effect of confounding variables not 
addressed in individual studies. 

The next step should be, in addition to the more complete and 
thorough description of studies, the continued coding of studies not 
included in this analysis in order to extend the data base such that further 
analysis can be undertaken. It is then, with an expanded data base, that 
the technique of meta-analysis can be used to its fullest potential. In 
addition, this continued coding could lead to the inclusion of research 
areas as yet not analyzed. These include behavioral objectives, kinetic 
structure, mathemagenic behaviors, scope of content and the integrated 
curriculum hypothesis, and the organization of curriculum. 






REFERENCES 



Glass, G.V., McGaw, B., and Smith, M.L. Meta-Analysis in 
Social Research. Beverly Hills, California: Sage 
Publications, 1981. 

Hanna, J.F. A New Approach to the Formulation and Testing 
of Learning Models. Synthese, 1966, 16, 344-380. 

Hanna, J.F. Explanation, Prediction, Description, and Information 
Theory. Synthese, 1969, 20, 308-334. 

Popper, K.R. Conjectures and Refutations: The Growth of 
Scientific Knowledge. London: Routledge and Kegan Paul, 
1962. 

Yeany, R.H., Jr. Applying and Interpreting the Results of 
Meta-Analysis Procedures. Paper presented at the 
meeting of the National Association for Research in 
Science Teaching, Boston, April 1980. 



